We review the recent rapid progress in the statistical physics of evolving networks. Interest has
focused mainly on the structural properties of random complex networks in communications, biology,
social sciences and economics. A number of giant artificial networks of this kind have come into
existence recently. This opens a wide field for the study of their topology, evolution, and the
complex processes occurring in them. Such networks possess a rich set of scaling properties. A
number of them are scale-free and show striking resilience against random breakdowns. In spite of
the large sizes of these networks, the distances between most of their vertices are short, a
feature known as the "small-world" effect. We discuss how growing networks self-organize into
scale-free structures and the role of the mechanism of preferential linking. We consider the
topological and structural properties of evolving networks, and percolation in these networks. We
present a number of models demonstrating the main features of evolving networks and discuss current
approaches for their simulation and analytical study. Applications of the general results to
particular networks in Nature are discussed. We demonstrate the generic connections of the network
growth processes with the general problems of non-equilibrium physics, econophysics, evolutionary
biology, etc.



CONTENTS

I. Introduction
II. Historical background
III. Structural characteristics of evolving networks
   A. Degree
   B. Shortest path
   C. Clustering coefficient
   D. Size of the giant component
   E. Other many-vertex characteristics
IV. Notions of equilibrium and non-equilibrium networks
V. Evolving networks in Nature
   A. Networks of citations of scientific papers
   B. Networks of collaborations
   C. Communications networks, the WWW and Internet
      1. Structure of the Internet
      2. Structure of the WWW
   D. Biological networks
      1. Structure of neural networks
      2. Networks of metabolic reactions
      3. Protein networks
      4. Ecological and food webs
      5. Word Web of human language
   E. Electronic circuits
   F. Other networks
VI. Classical random graphs, the Erdős-Rényi model
VII. Small-world networks
   A. The Watts-Strogatz model and its variations
   B. The smallest-world network
   C. Other possibilities to obtain large clustering coefficient
VIII. Growing exponential networks
IX. Scale-free networks
   A. Barabási-Albert model and the idea of preferential linking
   B. Master equation approach
   C. A simple model of scale-free networks
   D. Scaling relations and cutoff
   E. Continuum approach
   F. More complex models and estimates for the WWW
   G. Types of preference providing scale-free networks
   H. "Condensation" of edges
   I. Correlations and distribution of edges over network
   J. Accelerated growth of networks
   K. Decaying networks
   L. Eigenvalue spectrum of the adjacency matrix
   M. Scale-free trees
X. Non-scale-free networks with preferential linking
XI. Percolation on networks
   A. Theory of percolation on undirected equilibrium networks
   B. Percolation on directed equilibrium networks
   C. Failures and attacks
   D. Resilience against random breakdowns
   E. Intentional damage
   F. Disease spread within networks
   G. Anomalous percolation on growing networks
XII. Growth of networks and self-organized criticality
   A. Linking with sand-pile problems
   B. Preferential linking and the Simon model
   C. Multiplicative stochastic models and the generalized Lotka-Volterra equation
XIII. Concluding remarks
Acknowledgements
References



I. INTRODUCTION 



The Internet and World Wide Web are perhaps the most impressive
creations of our civilization (Baran 1964). Their influence on us is
incredible. They are part of our life, of our world. Our present and
our future are impossible without them. Nevertheless, we know much
less about them than one might expect. We know surprisingly little of
their structure and hierarchical organization, their global topology,
their local properties, and the various processes occurring within
them. This knowledge is needed for the most effective functioning of
the Internet and WWW, for ensuring their safety, and for utilizing all
of their possibilities. Certainly, the understanding of such problems
is a topic not of computer science and applied mathematics, but rather
of non-equilibrium statistical physics.

In fact, these wonderful communications nets [2]-[10] are only
particular examples of a great class of evolving networks. Numerous
networks, e.g., collaboration networks [11]-[15], public relations
nets [16]-[20], citations of scientific papers [21]-[27], some
industrial networks [11], [12], [28], transportation networks [29],
[30], nets of relations between enterprises and agents in financial
markets [31], telephone call graphs [32], many biological networks
[33]-[?], food and ecological webs [?], etc., belong to it. The
finiteness of these networks sets serious restrictions on extracting
useful experimental data because of strong size effects and, often,
insufficient statistics. The large size of the Internet and WWW and
their extensive and easily accessible documentation allow reliable and
informative experimental investigation of their structure and
properties. Unfortunately, the statistical theory of neural networks
[?] seems to be rather useless for the understanding of problems of
the evolution of networks, since this advanced theory does not
seriously touch on the main question arising for real networks: how
networks become specifically structured during their growth.

Quite recently, general features of the structural organization of
such networks were discovered [?]. It has become clear that their
complex scale-free structure is a natural consequence of the
principles of their growth. Some simple basic ideas have been
proposed. Self-organization of growing networks and processes
occurring within them have been related [?]-[63] to corresponding
phenomena (growth phenomena [64], self-organization [65]-[67] and
self-organized criticality [68]-[70], percolation [71]-[73],
localization, etc.) that have been studied by physicists for a long
time.

The goal of our paper is to review the recent rapid progress in
understanding the evolution of networks using ideas and methods of
statistical physics. The problems that we discuss relate to computer
science, mathematics, physics, engineering, biology, economics, and
social sciences. Here, we present the point of view of physicists. To
restrict ourselves, we do not dwell on Boolean and neural networks.



II. HISTORICAL BACKGROUND



The structure of networks has been studied by mathematical graph
theory [74]-[76]. Some basic ideas, used later by physicists, were
proposed long ago by the incredibly prolific and outstanding Hungarian
mathematician Paul Erdős and his collaborator Rényi [77], [78].
Nevertheless, the most intriguing type of growing networks, which
evolve into scale-free structures, has not been studied by graph
theory. Most of the results of graph theory [79], [80] are related to
the simplest random graphs with a Poisson distribution of connections
[77], [78] (the classical random graph). Moreover, in graph theory, by
definition, random graphs are graphs with a Poisson distribution of
connections (we use this term in a much wider sense). Nevertheless,
one should note the very important results obtained recently by
mathematicians for graphs with an arbitrary distribution of
connections [81], [82].

The mostly empirical study of specific large random networks such as
nets of citations in scientific literature has a long history
[21]-[?]. Unfortunately, their limited sizes did not allow one to
obtain reliable data and describe their structure until recently.

Fundamental concepts such as the functioning and practical
organization of large communications networks were elaborated by the
"father" of the Internet, Paul Baran (Baran 1964). Actually, many
present studies are based on his original ideas and use his
terminology. What is the optimal design of communications networks?
How may one ensure their stability and safety? These and many other
vital problems were first studied by P. Baran in a practical context.

By the mid-1990s, the Internet and the WWW had reached very large
sizes and continued to grow so rapidly that intensively developed
search engines failed to cover a great part of the WWW. A clear
knowledge of the structure of the WWW has become vitally important for
its effective operation.

The first experimental data, mostly for the simplest structural
characteristics of the communications networks, were obtained in
1997-1999 [?], [90], [91]. Distributions of the number of connections
in the networks and their surprisingly small average shortest-path
lengths were measured. A special role of long-tailed, power-law
distributions was revealed. After these findings, physicists started
an intensive study of evolving networks in various areas, from
communications to biology and public relations.



III. STRUCTURAL CHARACTERISTICS OF 
EVOLVING NETWORKS 



Let us start by introducing the objects under discussion. The networks
that we consider are graphs consisting of vertices (nodes) connected
by edges (links). Edges may be directed or undirected (leading to
directed and undirected networks, respectively). To define distances
in a network, one sets the lengths of all edges equal to one.

Here we do not consider networks with unit loops (edges that start and
terminate at the same vertex) or multiple edges, i.e., we assume that
only one edge may connect two vertices. (One should note that multiple
edges are encountered in some collaboration networks [?]. Pairs of
opposing edges connect some vertices in the WWW, in networks of
protein-protein interactions, and in food webs. Also, protein-protein
interaction nets and food webs contain unit loops (see below). Nets
with "weighted" edges are discussed in Ref. [?].)

The structure of a network is described by its adjacency matrix, $B$,
whose elements are zeros and ones. An element of the adjacency matrix
of a network with undirected edges, $b_{\mu\nu}$, is 1 if vertices
$\mu$ and $\nu$ are connected, and is 0 otherwise. Therefore, the
adjacency matrix of a network with undirected edges is symmetric. For
a network with directed edges, an element of the adjacency matrix,
$b_{\mu\nu}$, equals 1 if there is an edge from the vertex $\mu$ to
the vertex $\nu$, and equals 0 otherwise.
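
As a small illustration of the adjacency-matrix description, the following Python sketch builds the matrix for an undirected and a directed toy network and checks the symmetry property. The edge lists, the vertex labels 0..3 and the helper names `adjacency` and `is_symmetric` are assumptions made for this example only.

```python
# Hypothetical toy networks with N = 4 vertices labelled 0..3.
n = 4
undirected_edges = [(0, 1), (1, 2), (2, 3)]
directed_edges = [(0, 1), (1, 2), (2, 0), (3, 0)]

def adjacency(n, edges, directed):
    """Adjacency matrix b[mu][nu] with entries 0 or 1."""
    b = [[0] * n for _ in range(n)]
    for mu, nu in edges:
        b[mu][nu] = 1          # edge from mu to nu
        if not directed:
            b[nu][mu] = 1      # undirected edge: mirror the entry
    return b

def is_symmetric(b):
    """True if b[i][j] == b[j][i] for all pairs of vertices."""
    size = len(b)
    return all(b[i][j] == b[j][i] for i in range(size) for j in range(size))

B_und = adjacency(n, undirected_edges, directed=False)
B_dir = adjacency(n, directed_edges, directed=True)

# Row sums of the directed matrix count outgoing edges,
# column sums count incoming edges.
out_degree = [sum(row) for row in B_dir]
in_degree = [sum(B_dir[mu][nu] for mu in range(n)) for nu in range(n)]
```

In this sketch the undirected matrix is symmetric while the directed one is not, in line with the statement above.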

In the case of a random network, an adjacency matrix 
describes only a particular member of the entire statis- 
tical ensemble of random graphs. Hence, what one ob- 
serves is only a particular realization of this statistical 
ensemble and the adjacency matrix of this graph is only 
a particular member of the corresponding ensemble of 
matrices. 

The statistics of the adjacency matrix of a random net- 
work contains complete information about the structure 
of the net, and, in principle, one has to study just the 
adjacency matrix. Generally, this is not an easy task, so 
that, instead of this, only a very restricted set of struc- 
tural characteristics is usually considered. 



A. Degree 

The simplest and most intensively studied one-vertex characteristic is
degree. The degree, $k$, of a vertex is the total number of its
connections. (In the physics literature, this quantity is often called
"connectivity", which has a quite different meaning in graph theory.
Here, we use the mathematically correct definition.) The in-degree,
$k_i$, is the number of incoming edges of a vertex. The out-degree,
$k_o$, is the number of its outgoing edges. Hence, $k = k_i + k_o$.
The degree is actually the number of nearest neighbors of a vertex,
$z_1$. The distributions of vertex degrees over an entire network are
its basic statistical characteristics: the joint in- and out-degree
distribution $P(k_i, k_o)$, the degree distribution $P(k)$, the
in-degree distribution $P_i(k_i)$, and the out-degree distribution
$P_o(k_o)$. Here,

$$
\begin{aligned}
P(k) &= \sum_{k_i} P(k_i, k - k_i) = \sum_{k_o} P(k - k_o, k_o), \\
P_i(k_i) &= \sum_{k_o} P(k_i, k_o), \\
P_o(k_o) &= \sum_{k_i} P(k_i, k_o).
\end{aligned}
\qquad (1)
$$

For brevity, instead of $P_i(k_i)$ and $P_o(k_o)$ we usually use the
notations $P(k_i)$ and $P(k_o)$. If a network has no connections with
the exterior, then the average in- and out-degrees are equal:

$$
\bar{k}_i = \sum_{k_i, k_o} k_i P(k_i, k_o) = \bar{k}_o = \sum_{k_i, k_o} k_o P(k_i, k_o).
\qquad (2)
$$



Although the degree of a vertex is a local quantity, we shall see that
a degree distribution often determines some important global
characteristics of random networks. Moreover, if statistical
correlations between vertices are absent, $P(k_i, k_o)$ totally
determines the structure of the network.
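
A minimal Python sketch of Eqs. (1) and (2) follows. It computes the joint in- and out-degree distribution of a small directed network, its marginals and the total-degree distribution, and the two average degrees. The edge list and all variable names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy directed network given as (source, target) pairs.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
n_vertices = 4

k_in = Counter(v for _, v in edges)    # in-degree k_i of each vertex
k_out = Counter(u for u, _ in edges)   # out-degree k_o of each vertex

# Joint distribution P(k_i, k_o): fraction of vertices with each (k_i, k_o).
joint = Counter((k_in[v], k_out[v]) for v in range(n_vertices))
P_joint = {pair: count / n_vertices for pair, count in joint.items()}

# Marginals and total-degree distribution, following Eq. (1).
P_in, P_out, P_total = Counter(), Counter(), Counter()
for (ki, ko), p in P_joint.items():
    P_in[ki] += p
    P_out[ko] += p
    P_total[ki + ko] += p

# Average in- and out-degree, following Eq. (2).
mean_in = sum(ki * p for (ki, ko), p in P_joint.items())
mean_out = sum(ko * p for (ki, ko), p in P_joint.items())
```

Because every edge has exactly one source and one target inside the network, `mean_in` and `mean_out` coincide, both being the number of edges divided by the number of vertices.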



B. Shortest path 

One may define a geodesic distance between two vertices, $\mu$ and
$\nu$, of a graph with unit-length edges. It is the shortest-path
length, $\ell_{\mu\nu}$, from the vertex $\mu$ to the vertex $\nu$. If
edges are directed, $\ell_{\mu\nu}$ is not necessarily equal to
$\ell_{\nu\mu}$. It is possible to introduce the distribution of the
shortest-path lengths between pairs of vertices of a network and the
average shortest-path length $\bar{\ell}$ of a network. The average
here is over all pairs of vertices between which a path exists and
over all realizations of a network.

$\bar{\ell}$ is often called the "diameter" of a network. It
determines the effective "linear size" of a network, the average
separation of pairs of vertices. For a lattice of dimension $d$
containing $N$ vertices, obviously, $\bar{\ell} \sim N^{1/d}$. In a
fully connected network, $\bar{\ell} = 1$. One may roughly estimate
$\bar{\ell}$ of a network in which random vertices are connected. If
the average number of nearest neighbors of a vertex is $z_1$, then
about $z_1^{\ell}$ vertices of the network are at a distance $\ell$
from the vertex or closer. Hence, $N \sim z_1^{\bar{\ell}}$ and then
$\bar{\ell} \sim \ln N / \ln z_1$, i.e., the average shortest-path
length is small even for very large networks. This smallness is
usually referred to as the small-world effect.

One can also introduce the maximal shortest-path length over all the
pairs of vertices between which a path exists. This characteristic
determines the maximal extent of a network. (In some papers the
maximal shortest path is also referred to as the diameter of the
network, so we avoid using this term.)
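
The rough estimate of the average shortest-path length can be compared with a direct measurement. The sketch below is only an illustration under stated assumptions: it uses a random graph with N = 2000 vertices and mean degree 6, samples 50 source vertices rather than all pairs, and the helper names `random_graph` and `bfs_distances` are hypothetical.

```python
import math
import random
from collections import deque

def random_graph(n, m, rng):
    """Undirected graph with n vertices and m distinct random edges (no loops)."""
    adj = [set() for _ in range(n)]
    edges = 0
    while edges < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj

def bfs_distances(adj, source):
    """Shortest-path lengths (unit edge length) from source to reachable vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

rng = random.Random(1)
n, z1 = 2000, 6.0
adj = random_graph(n, int(z1 * n / 2), rng)

# Average over pairs (source, target) between which a path exists,
# using a sample of sources to keep the run short.
total, pairs = 0, 0
for s in rng.sample(range(n), 50):
    d = bfs_distances(adj, s)
    total += sum(d.values())
    pairs += len(d) - 1            # exclude the source itself
ell_measured = total / pairs
ell_estimate = math.log(n) / math.log(z1)
```

The measured value and the estimate ln N / ln z_1 should agree only roughly, since the estimate ignores prefactors and finite-size corrections.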



C. Clustering coefficient 

For the description of connections in the environment closest to a
vertex, one introduces the so-called clustering coefficient. For a
network with undirected edges, the number of all possible connections
between the nearest neighbors of a vertex $\mu$ ($z_1^{(\mu)}$ nearest
neighbors) equals $z_1^{(\mu)}(z_1^{(\mu)} - 1)/2$. Let only
$y^{(\mu)}$ of them be present. The clustering coefficient of this
vertex, $C^{(\mu)} = y^{(\mu)}/[z_1^{(\mu)}(z_1^{(\mu)} - 1)/2]$, is
the fraction of existing connections between nearest neighbors of the
vertex. Averaging $C^{(\mu)}$ over all vertices of a network yields
the clustering coefficient of the network, $C$. The clustering
coefficient is the probability that two nearest neighbors of a vertex
are also nearest neighbors of one another. The clustering coefficient
of the network reflects the "cliquishness" of the mean closest
neighborhood of a network vertex, that is, the extent to which the
nearest neighbors of a vertex are the nearest neighbors of each other.
One should note that the notion of clustering was introduced in
sociology [?].

From another point of view, $C$ is the probability that if a triple of
vertices of a network is connected together by at least two edges then
the third edge is also present. One can check that $C/3$ is equal to
the number of triples of vertices connected together by three edges
divided by the number of all connected triples of vertices.

Instead of $C^{(\mu)}$, it is equally possible to use another related
characteristic of clustering, $D^{(\mu)} = (z_1^{(\mu)} +
y^{(\mu)})/[(z_1^{(\mu)} + 1) z_1^{(\mu)}/2]$, that is, the fraction
of existing connections inside a set of vertices consisting of the
vertex $\mu$ and all its nearest neighbors. $D^{(\mu)}$ plays the role
of a local density of linkage. $C^{(\mu)}$ and $D^{(\mu)}$ are
connected by the following relations:

$$
\begin{aligned}
D^{(\mu)} &= C^{(\mu)} + \frac{2}{z_1^{(\mu)} + 1}\left(1 - C^{(\mu)}\right), \\
C^{(\mu)} &= D^{(\mu)} - \frac{2}{z_1^{(\mu)} - 1}\left(1 - D^{(\mu)}\right).
\end{aligned}
\qquad (3)
$$

In a network in which all pairs of vertices are connected (the
complete graph), $C = D = 1$. For tree-like graphs, $C = 0$. In a
classical random graph, $C = M/[N(N - 1)/2] = z_1/(N - 1)$ and
$D = M/[N(N + 1)/2] = z_1/(N + 1)$. Here, $N$ is the total number of
vertices of the graph, $M$ is the total number of its edges, and $z_1$
is the average number of nearest neighbors of a vertex in the graph,
$M = z_1 N/2$. In an ordered lattice, $0 \le C \le 1$ depending on its
structure. Note that $0 \le C \le 1$ but
$0 < 2/(z_1 + 1) \le D \le 1$.
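
The definitions of the two clustering characteristics and the relation between them, Eq. (3), can be checked numerically. The sketch below is a minimal illustration: the five-vertex graph and the function name `local_clustering` are assumptions for this example, and vertices with fewer than two neighbors are skipped because C is undefined for them.

```python
from itertools import combinations

# Hypothetical small undirected graph as an adjacency dictionary.
adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2, 4},
    4: {3},
}

def local_clustering(adj, mu):
    """Return (z1, y, C, D) for vertex mu; needs z1 >= 2 for C to be defined."""
    nbrs = adj[mu]
    z1 = len(nbrs)
    # y: number of edges actually present between nearest neighbors of mu.
    y = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    C = y / (z1 * (z1 - 1) / 2)
    D = (z1 + y) / ((z1 + 1) * z1 / 2)
    return z1, y, C, D

results = {mu: local_clustering(adj, mu) for mu in adj if len(adj[mu]) >= 2}

# Check the first relation of Eq. (3) for every vertex considered.
relation_holds = all(
    abs(D - (C + 2 / (z1 + 1) * (1 - C))) < 1e-12
    for z1, y, C, D in results.values()
)
```

For vertex 0 of this toy graph, two of the three possible edges between its neighbors are present, so C = 2/3 and D = 5/6.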



D. Size of the giant component 

Generally, a network may contain disconnected parts. In networks with
undirected edges, it is easy to introduce the notion corresponding to
the percolating cluster in the case of disordered lattices. If the
relative size of the largest connected cluster of vertices of a
network (the largest connected component) approaches a nonzero value
when the network is grown to infinite size, the system is above the
percolation threshold, and this cluster is called the giant connected
component of the network. In this case, the sizes of the next largest
clusters are small compared to the giant connected component for a
large enough network. Nevertheless, size effects are usually strong
(see Sec. XI C), and for accurate measurement of the size of the giant
connected component, large networks must be used.

One may generalize this notion to networks with directed edges. In
this case, we have to consider a cluster of vertices from each of
which one can reach any other vertex of the cluster. Such a cluster
may be called a strongly connected component. If the largest strongly
connected component contains a finite fraction of all vertices in the
large network limit, it is called the giant strongly connected
component. Connected clusters obtained from a directed network by
ignoring the directions of its edges are called weakly connected
components, and one can define the giant weakly connected component of
a network.
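
To illustrate the notion of a giant connected component for undirected edges, the following sketch measures the relative size of the largest connected component of a random graph at two mean degrees. It is an illustration under assumptions not made in this section: the random-graph construction, the sizes, the union-find bookkeeping and the name `largest_component_fraction` are all choices made here, and the location of the threshold near mean degree 1 is a standard result for classical random graphs rather than something derived above.

```python
import random
from collections import Counter

def largest_component_fraction(n, mean_degree, rng):
    """Relative size of the largest connected component of a random graph."""
    parent = list(range(n))

    def find(x):
        # Root of the cluster containing x (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Random edges; occasional loops or repeated pairs are harmless here,
    # since they do not change which vertices are connected.
    for _ in range(int(mean_degree * n / 2)):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a != b:
            parent[a] = b          # merge the two clusters

    sizes = Counter(find(v) for v in range(n))
    return max(sizes.values()) / n

rng = random.Random(7)
below = largest_component_fraction(5000, 0.5, rng)   # sparse: no giant component
above = largest_component_fraction(5000, 2.0, rng)   # denser: giant component
```

With these parameters, `below` should be a tiny fraction of the network while `above` should be a finite fraction, which is the qualitative distinction described in the text. Strongly connected components of directed networks would need a different algorithm and are not covered by this sketch.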



E. Other many-vertex characteristics 

One can get a general picture of the distribution of edges between
vertices in a network by considering the average elements of the
adjacency matrix, $\overline{b_{\mu\nu}}$ (here, the averaging is over
realizations of the evolution process, if the network is evolving, or
over all configurations, if it is static), although this
characteristic is not very informative.

A local characteristic, degree, $k = k_1 = z_1$, can be easily
generalized. It is possible to introduce the number of vertices at a
distance equal to 2 or less from a vertex, $k_2$, the number of second
neighbors, $z_2 = k_2 - k_1$, etc. Generalization of the clustering
coefficient is also straightforward: one has to count all edges
between $n$-th nearest neighbors.

One may consider distributions of these quantities and their average
values. Often, it is possible to fix a vertex not by its label, $\mu$,
but only by its in- and out-degrees. Therefore, it is reasonable to
introduce the probability $P(k_i, k_o; k_i', k_o')$ that a pair of
vertices, the first with in- and out-degrees $k_i$ and $k_o$ and the
second with in- and out-degrees $k_i'$ and $k_o'$, are connected by a
directed edge going out from the first vertex and coming into the
second one [94], [95].

It is easy to introduce a similar quantity for networks with
undirected edges, namely the distribution $P(k_1, k_2)$ of the degrees
of nearest-neighbor vertices. This distribution indicates correlations
between the degrees of nearest neighbors in a network: if
$P(k_1, k_2)$ does not factorize, these correlations are present [94],
[95]. Unfortunately, it is hard to measure such distributions because
of poor statistics. However, one may easily observe these correlations
by studying a related characteristic, the dependence
$\bar{k}_{nn}(k)$ of the average degree of the nearest neighbors
$\bar{k}_{nn}$ on the degree $k$ of a vertex [96].
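
As an illustration of the last characteristic, the sketch below computes the average nearest-neighbor degree as a function of vertex degree for a small undirected graph. The graph (a hub with a few low-degree neighbors) and the variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical undirected graph: a hub (0) linked to four vertices,
# two of which (1 and 2) are also linked to each other.
adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}

sums = defaultdict(float)   # sum of mean neighbor degrees over vertices of degree k
counts = defaultdict(int)   # number of vertices of degree k
for v, nbrs in adj.items():
    k = len(nbrs)
    sums[k] += sum(len(adj[w]) for w in nbrs) / k
    counts[k] += 1

# Average degree of the nearest neighbors of vertices of degree k.
k_nn = {k: sums[k] / counts[k] for k in sorted(counts)}
```

Here `k_nn` decreases with k (low-degree vertices are attached to the hub), which signals degree-degree correlations; a flat dependence would indicate their absence.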

Similarly, it is difficult to measure a standard joint in- and
out-degree distribution $P(k_i, k_o)$. However, one may measure the
dependences $\bar{k}_i(k_o)$ of the average in-degree $\bar{k}_i$ for
vertices of out-degree $k_o$ and $\bar{k}_o(k_i)$ of the average
out-degree $\bar{k}_o$ for vertices of in-degree $k_i$.

One may also consider the probability, $P_n(k_1, k_n)$, that the
number of vertices at a distance $n$ or less from a vertex equals
$k_n$, if the degree of the vertex is $k_1$, etc. Some other many-node
characteristics will be introduced hereafter.



IV. NOTIONS OF EQUILIBRIUM AND 
NON-EQUILIBRIUM NETWORKS 

From a physical point of view, random networks may 
be "equilibrium" or "non-equilibrium" . Let us introduce 
these important notions using simple examples. 

(a) An example of an equilibrium random network: a classical
undirected random graph [77], [78] (see Sec. VI).

It is defined by the following rules: 

(i) The total number of vertices is fixed. 

(ii) Randomly chosen pairs of vertices are connected 
via undirected edges. 

Vertices of the classical random graph are statistically 
independent and equivalent. The construction procedure 
of such a graph may be thought of as the subsequent ad- 
dition of new edges between vertices chosen at random. 
When the total number of vertices is fixed, this procedure 
obviously produces equilibrium configurations. 

(b) An example of a non-equilibrium random network: a simple random
graph growing through the simultaneous addition of vertices and edges
(see, e.g., Refs. [203], [204] and Sec. XI G).

Definition of this graph:

(i) At each time step, a new vertex is added to the 
graph. 

(ii) Simultaneously, a pair (or several pairs) of ran- 
domly chosen vertices is connected. 

One sees that the system is not in equilibrium. Edges are
inhomogeneously distributed over the graph. The oldest vertices are
the most connected (in the statistical sense), and the degrees of new
vertices are the smallest. If, at some moment, we stop increasing the
number of vertices but continue the random addition of edges, then the
network will tend to an "equilibrium state" but never achieve it.
Indeed, edges of the network do not disappear, so the inhomogeneity
survives. An "equilibrium state" can be achieved only if, in addition,
we allow old edges to disappear from time to time.
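
The inhomogeneity of the growing graph of example (b) is easy to see in a simulation. The following sketch is an illustration under assumptions: one pair of vertices is connected per time step, the newly added vertex is included in the pool from which the pair is drawn, and occasional multiple edges are tolerated because they do not affect the comparison of average degrees.

```python
import random

rng = random.Random(3)
steps = 20000
degree = [0]                       # start from a single isolated vertex

for _ in range(steps):
    degree.append(0)               # (i) add a new vertex
    # (ii) connect a pair of distinct, randomly chosen vertices
    u, v = rng.sample(range(len(degree)), 2)
    degree[u] += 1
    degree[v] += 1

# Vertices are stored in order of birth, so slices select age groups.
tenth = len(degree) // 10
mean_oldest = sum(degree[:tenth]) / tenth
mean_youngest = sum(degree[-tenth:]) / tenth
```

The oldest tenth of the vertices ends up with a much larger average degree than the youngest tenth, as stated in the text.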

The specific case of equilibrium networks with a Poisson degree
distribution was actually the main object of graph theory for more
than forty years. Physicists have started the study of non-equilibrium
(growing) networks. The construction procedure for an equilibrium
graph with an arbitrary degree distribution $P(k)$ was proposed by
Molloy and Reed [81], [82] (note that this procedure cannot be
considered as quite rigorous):

(a) To the vertices $\{\mu\}$ of the graph ascribe degrees
$\{k_\mu\}$ taken from the distribution $P(k)$. Now the graph looks
like a family of hedgehogs: each vertex has $k_\mu$ quills sticking
out (see Fig. 1(a)).

(b) Connect at random ends of pairs of distinct quills belonging to
distinct vertices (see Fig. 1(b)).



FIG. 1. The construction procedure for an equilibrium random graph
with a preset arbitrary degree distribution $P(k)$. (a) Degrees
$\{k_\mu\}$ taken from the distribution are ascribed to the vertices
$\{\mu\}$. (b) Pairs of random ends sticking out of different vertices
are connected.

The generalization of this construction procedure to directed
equilibrium graphs with arbitrary joint in- and out-degree
distributions $P(k_i, k_o)$ is straightforward.
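
The two steps of the Molloy-Reed construction for undirected graphs can be sketched in Python as follows. This is one possible variant, not necessarily the authors' exact prescription: pairings that would produce a unit loop or a multiple edge are rejected and the whole pairing is redone, the degree sequence is drawn from a hypothetical distribution on {1, 2, 3}, and the function name `molloy_reed` is chosen here for illustration.

```python
import random

def molloy_reed(degrees, rng, max_tries=1000):
    """Random simple graph with given degrees via random pairing of 'quills'.

    Assumed variant: if a pairing creates a unit loop or a multiple edge,
    the whole pairing is discarded and repeated.
    """
    if sum(degrees) % 2:
        raise ValueError("the sum of degrees must be even")
    # Step (a): one entry per quill; vertex mu appears degrees[mu] times.
    stubs = [mu for mu, k in enumerate(degrees) for _ in range(k)]
    for _ in range(max_tries):
        # Step (b): connect randomly chosen pairs of quills.
        rng.shuffle(stubs)
        edges = set()
        for a, b in zip(stubs[::2], stubs[1::2]):
            edge = (min(a, b), max(a, b))
            if a == b or edge in edges:
                break                  # loop or multiple edge: retry
            edges.add(edge)
        else:
            return edges
    raise RuntimeError("no simple graph found; try a sparser degree sequence")

rng = random.Random(5)
degrees = [rng.choice([1, 2, 3]) for _ in range(200)]
if sum(degrees) % 2:
    degrees[0] += 1                    # make the number of quills even
edges = molloy_reed(degrees, rng)

# Degrees realized by the constructed graph.
realized = [0] * len(degrees)
for a, b in edges:
    realized[a] += 1
    realized[b] += 1
```

Whole-pairing rejection is practical only for sparse degree sequences; for broad distributions the acceptance probability becomes small and other schemes are preferable.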

While speaking about random networks we should keep in mind that a
particular network we observe is only one member of a statistical
ensemble of all possible realizations. Hence when we speak about
random networks, we actually mean statistical ensembles. The canonical
ensemble for an undirected network with $N$ vertices has
$2^{N(N-1)/2}$ members, i.e., realizations (recall that unit loops and
multiple edges are forbidden). Each member of the ensemble is a
distinct configuration of edges taken with some statistical weight. A
rigorous definition of a random network must contain a set of
statistical weights for all configurations of edges. A grand canonical
ensemble of random graphs may be obtained using standard approaches of
statistical mechanics. The result, namely the statistical ensemble of
equilibrium random networks, is completely determined by the degree
distribution.

The above rather heuristic procedure of Molloy and Reed provides only
a particular realization of the equilibrium graph. Unfortunately, this
procedure is not very convenient for the construction of the entire
statistical ensemble, at least for finite-size networks. Surprisingly,
the rigorous construction of the statistical ensemble of equilibrium
random graphs was made only for classical random graphs (see Sec. VI),
and the problem of a strict formal construction of the statistical
ensemble of equilibrium random graphs with a given degree distribution
is still open. (However, see Ref. [97] for the construction procedure
for the statistical ensemble of trees.)

It is possible to construct an equilibrium graph in another way than
the Molloy-Reed procedure. Suppose one wants to obtain a large enough
equilibrium undirected graph with a given set of vertex degrees
$k_\mu$, where $\mu = 1, \ldots, N$. Let us start from an arbitrary
configuration of edges connecting these vertices of degree $k_\mu$. We
must "equilibrate" the graph. For this:

(a) Connect a pair of arbitrary vertices (e.g., 1 and 2) by an
additional edge. Then the degrees of these vertices increase by one
($k_1' = k_1 + 1$ and $k_2' = k_2 + 1$).

(b) Choose at random one of the edge ends attached to vertex 1 and
rewire it to a randomly chosen vertex 3. Choose at random one of the
edge ends attached to vertex 2 and rewire it to a randomly chosen
vertex 4. Then $k_1'' = k_1$, $k_2'' = k_2$ and $k_3' = k_3 + 1$,
$k_4' = k_4 + 1$.

(c) Repeat (b) until equilibrium is reached.

Only two vertices of the resulting network have degrees greater (by
one) than the given degrees $k_\mu$. For a large network, this is
non-essential. If, during the procedure, both edge ends under rewiring
turn out to be rewired to the same vertex, then, at the next step, one
may rewire a pair of randomly chosen edge ends from this vertex.
Another procedure for the same purpose is described in Ref. [98].

The notion of the statistical ensemble of growing net- 
works may also be introduced in a natural way. This 
ensemble includes all possible paths of the evolution of a 
network. 



V. EVOLVING NETWORKS IN NATURE 



In the present section we discuss some of the most 
prominent large networks in Nature starting with the 
most simply organized one. 



A. Networks of citations of scientific papers 

The vertices of these networks are scientific papers, the directed
edges are citations. The growth process of the citation networks is
very simple (see Fig. 2). Almost every new article contains a nonzero
number of references to old ones. This is the only way to create new
edges. The appearance of new connections between old vertices is
impossible (one may think that old papers are not updated). The number
of references to some paper is the in-degree of the corresponding
vertex of the network.



FIG. 2. Scheme of the growth of citation networks. Each 
new paper contains references to previously published articles 
or books. It is assumed that old papers are not updated, so 
new connections between old vertices are impossible. 

The average number of references in a paper is of the order of
$10^1$, so such networks are sparse. In Ref. [27], the data from an
ISI database for the period from 1981 to June 1997 and citations from
Phys. Rev. D 11-50 (1975-1994) were used to find the distributions of
the number of citations, i.e., the in-degree distributions. The first
network consists of 783 339 nodes and 6 716 198 links, and the maximum
number of citations is $k_i = 8\,904$. The second network contains
24 296 nodes connected by 351 872 links, and its maximum in-degree
equals 2 026. The out-degree is rather small, so the degree
distribution coincides with the in-degree one in the range of large
degree.

Unfortunately, the sizes of these networks are not sufficiently large
to find a conclusive functional form of the distributions. In Ref.
[27], both distributions were fitted by the $k^{-3}$ dependence. The
fitting by the dependence $(k_i + \mathrm{const})^{-\gamma}$ was
proposed in Ref. [99]. The exponents were estimated as $\gamma = 2.9$
for the ISI net and $\gamma = 2.6$ for the Phys. Rev. D citations.
Furthermore, in Ref. [100], the large in-degree part of the in-degree
distribution obtained for the Phys. Rev. D citation graph was fitted
by a power law with the exponent $\gamma = 1.9 \pm 0.2$. It was found
in the same paper that the average number of references per paper
increases as the citation graphs grow. The out-degree distributions
(the distributions of the number of references in papers) show
exponential tails. The factor in the exponential depends on whether or
not journals restrict the maximal number of pages in their papers.

It is possible to estimate roughly the values of the exponent knowing
the size $N$ of the network and the cut-off $k_{cut}$ of the
distribution, $\gamma \approx 1 + \ln N / \ln k_{cut}$ (see Sec.
IX D). Using the maximal number of citations as the cut-offs, the
authors of the papers [?] got the estimates $\gamma = 2.5$ for the ISI
net and $\gamma = 2.3$ for Phys. Rev. D. Moreover, they indicated from
a similar estimation that these data are also consistent with the
$k^{-y} \exp[-\mathrm{const}\, k^{1-y}]$ form of the distribution if
one sets $y = 0.9$ for the ISI net and $y = 0.7$ for Phys. Rev. D.
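
The quoted estimates of the exponent can be reproduced directly from the network sizes and maximal in-degrees given earlier in this section. The short sketch below only evaluates the rough formula with the maximal number of citations used as the cut-off; the dictionary layout and names are choices made for this illustration.

```python
import math

# (N, k_cut) as quoted in the text: network size and maximal number of citations.
networks = {
    "ISI": (783339, 8904),
    "Phys. Rev. D": (24296, 2026),
}

# Rough estimate gamma ~ 1 + ln N / ln k_cut for each network.
gamma = {
    name: 1 + math.log(n) / math.log(k_cut)
    for name, (n, k_cut) in networks.items()
}

for name, g in gamma.items():
    print(f"{name}: gamma ~ {g:.2f}")
```

The values come out near 2.49 and 2.33, consistent with the rounded estimates 2.5 and 2.3 quoted above.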

In Ref. [?], the very tail of a different distribution was studied.
The ranking dependence of the number of citations to the 1 120 most
cited physicists was described by a stretched exponential function. Of
course, the statistics of citations collected by authors necessarily
differ from the statistics of the citations to papers. Also, the form
of the tail of the distribution should be quite different from its
main part.

In Ref. [101], the process by which papers receive citations in a
growing citation network was studied empirically. 1 736 papers
published in Physical Review Letters in 1988 were considered, and the
dynamics of receiving 83 252 citations was analysed. It was
demonstrated that new citations (incoming edges) are distributed among
papers (vertices) with probability proportional to the degree of the
vertices. This indicates that a linear preferential attachment
mechanism operates in this citation graph.



B. Networks of collaborations 

The set of collaborations can be represented by a bipartite graph
containing two distinct types of vertices, collaborators and acts of
collaboration (see Fig. 3(a)) [?]. Collaborators connect together
through collaboration acts, so in this type of bipartite graph, direct
connections between vertices of the same kind are absent. Edges are
undirected. For instance, in the scientific collaboration bipartite
graphs, one kind of vertex corresponds to authors and the other one to
scientific papers [?]. In movie actor graphs, these two kinds of
vertices are actors and films, respectively [?], [28], [102].





FIG. 3. A bipartite collaboration graph (a) and one of its
one-mode projections (b) |3l| . Collaborators are denoted by
empty circles; the filled circles depict acts of collaboration.

Usually, instead of such bipartite graphs, their far less
informative one-mode projections are used (for the
projection procedure, see Fig. 3b). In particular, one can
directly connect vertices-collaborators without indicating
acts of collaboration. Note that the clustering coefficients
of such one-mode projections are large because
each act of collaboration simultaneously creates a number
of mutually connected nearest neighbors.
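The projection procedure can be sketched in a few lines. This is an illustrative example only: the input format (a dictionary mapping each act of collaboration to its participants) and the function names are assumptions, not taken from the cited works.

```python
from itertools import combinations


def one_mode_projection(acts):
    """Project a bipartite graph {act: collaborators} onto collaborators.

    Two collaborators become neighbors if they share at least one act.
    Multiple shared acts still give a single edge.
    """
    adj = {}
    for members in acts.values():
        members = set(members)
        for m in members:
            adj.setdefault(m, set())
        for u, v in combinations(sorted(members), 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def clustering(adj, v):
    """Local clustering coefficient of vertex v in an undirected graph."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical data: two papers with overlapping author lists.
acts = {"paper1": ["A", "B", "C"], "paper2": ["C", "D"]}
adj = one_mode_projection(acts)
```

In this toy example, author A has clustering coefficient 1 because both of its neighbors took part in the same act, which shows why projections of this kind tend to have large clustering.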

Note that, in principle, it is possible to introduce
multiple edges if there were several acts of collaboration
between the same collaborators. Also, one can consider
weighted edges that account for the reduced "effect" of
collaboration between a pair of collaborators when several
participants are simultaneously involved JlJ| . We do
not consider these possibilities here.

Collaboration networks are well documented. For example,
in Refs. [ fil]fi1| ], the movie actor one-mode graph
consisting of 225 226 actors is considered. The average
degree is $\bar{k} = 61$, and the average shortest path equals 3.65,
which is close to the corresponding value 3.00 for the
classical random graph with the same $\bar{k}$. The clustering
coefficient is large, $C = 0.79$ (for the corresponding classical
random graph it should be 0.00027). Note that in Ref.
J6l| , another value, $C = 0.199$, for the clustering
coefficient of a movie actor graph is given.
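The random-graph reference values quoted for the actor graph can be reproduced with the standard classical random graph estimates, a mean shortest path of about $\ln N / \ln \bar{k}$ and a clustering coefficient of about $\bar{k}/N$. The sketch below assumes these two textbook estimates; the function name is hypothetical.

```python
import math


def random_graph_baseline(n_vertices, mean_degree):
    """Estimates for a classical random graph with the same size and
    mean degree: (average shortest-path length, clustering coefficient)."""
    path_length = math.log(n_vertices) / math.log(mean_degree)
    clustering = mean_degree / n_vertices
    return path_length, clustering


# Movie actor graph: N = 225 226 vertices, mean degree 61.
length, c = random_graph_baseline(225226, 61)
print(round(length, 2), round(c, 5))
```

The same helper applies to the other networks of this section whenever the number of vertices and the mean degree are given.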

The distribution of the degree of vertices (number of
collaborators) in the movie actor network ($N = 212\,250$
and $\bar{k} = 28.78$) was observed to be of a power-law form
with the exponent $\gamma = 2.3$ [[35). In Ref. [102], the degree
distribution was fitted by the $(k + \mathrm{const})^{-\gamma}$ dependence
with the exponent $\gamma \approx 3.1$. Notice that, in Refs. |5[| and
[102], TV series were excluded from the dataset. The
reason for this is that each series is considered in the
database as a single movie with, sometimes, thousands
of actors. In Ref. |2^] the full dataset, including series,
was used, which yielded an exponential form of the degree
distribution (for statistical analysis, a cumulative
degree distribution was used).

Similar graphs for members of the boards of directors
of the Fortune 1 000 companies, for authors of several
huge electronic archives, etc. were also studied pl||l4] , |6"l[ .
Distributions of numbers of co-directors, of collaborators
that a scientist has, etc. were considered in Ref. [ pi] ]. The
distributions display a rather wide variety of forms, and it
is rarely possible to observe a pure power-law
dependence.

One can find data on the structure of large scientific
collaboration networks in Refs. The largest one
of them, MEDLINE, contains 1 520 254 authors with
18.1 collaborations per author. The clustering coefficient
equals 0.066. The giant connected component covers 93%
of the network. The size of the second largest component
equals 49, i.e., is of the order of $\ln N$. The average shortest
path is equal to 4.6, which is close to the value for the
corresponding classical random graph with the same average
degree. The maximal shortest path is several times longer
than the average one and equals 24. These data are
rather typical for such networks.

Mathematical (M) (70 975 different authors and 70 901
published papers) and neuroscience (NS) (209 293 authors
with 3 534 724 connections and 210 750 papers)
journals issued in the period 1991-1998 were scanned in
Refs. PJlQl]]. Degree distributions of these collaboration
networks were fitted by power laws with exponents
2.4 (M) and 2.1 (NS). Importantly, it was found
that the mean degrees of these networks were not constant
but grew linearly as the numbers of their vertices
increased. Hence, the networks became denser. The
average shortest-path lengths in these graphs and their
clustering coefficients decrease with time.

New edges were found to be preferentially attached to
vertices with a high number of connections. The probability
that a new vertex is attached to a vertex with
degree $k$ was proportional to $k^{y}$ with the exponent $y$
equal to $0.8 \pm 0.1$, so that some deviations from a linear
dependence were noticeable. However, new edges
emerged between pairs of already existing vertices
at a rate proportional to the product of the degrees
of the vertices in a pair.

Very similar results were also obtained for the actor
collaboration graph consisting of 392 340 vertices and
33 646 882 edges [101].



In Ref. [103], the preferential attachment process
within collaboration nets of the Medline database (1994-1999:
1 648 660 distinct names) and the Los Alamos E-print
Archive (1995-2000: 58 342 distinct names) was
studied. In fact, the relative probability that an edge added
at time $t$ connects to a vertex of degree $k$ was measured.
This probability was observed to be a linear function of
$k$ up to sufficiently large degrees, so that a linear preferential
attachment mechanism operates in such networks (compare
with Ref. [51]). However, the empirical dependence
saturated for $k > 150$ in the Los Alamos E-print Archive
collaboration net and even fell off for $k > 600$ in the
Medline network.



C. Communications networks, the WWW, and the 
Internet 



Roughly speaking, the Internet is a net of interconnected
vertices: hosts (computers of users), servers (computers
or programs providing a network service, which may
also be hosts), and routers that arrange traffic across
the Internet, see Fig. 4. Connections are undirected,
and traffic (including its direction) changes all the time.
Routers are united in domains. In January of 2001,
the Internet already contained about 100 million hosts.
However, it is not the hosts that determine the structure
of the Internet, but rather routers and domains. In July
of 2000, there were about 150 000 routers in the Internet
[104]. Later, the number rose to 228 265 (data from Ref.
[105]). Thus, one can consider the topology of the Internet
on a router level or inter-domain topology || . In the
latter case, it is actually a small network.

FIG. 4. Naive scheme of the structure of the Internet jE).
(Figure labels: hosts, domain.)

The World Wide Web is the array of its documents
plus hyper-links, that is, mutual references in these documents.
Although hyper-links are directed, pairs of counter-links
may, in principle, produce undirected connections. Web
documents are accessible through the Internet (wires and
hardware), and this determines the relation between the
Internet and the WWW.

1. Structure of the Internet

On the inter-domain level, the Internet is a really small
sparse network with the following basic characteristics
||. In November of 1997, it consisted of 3 015 vertices
and 5 156 edges, so the average degree was 3.42, and the
maximal degree of a vertex equaled 590. In April of 1998,
there were 3 530 vertices and 6 432 edges, the average
degree was 3.65, and the highest degree was 745. In December
of 1998, there were 4 389 vertices and 8 256 edges, so the
average degree was 3.76 and the maximal degree equaled
979. The average shortest path is found to be about 4,
as it should be for the corresponding classical random
graph; the maximal shortest path is about 10.

The degree distribution of this network was reported
to be of a power-law form, $P(k) \propto k^{-\gamma}$ where $\gamma \approx 2.2$
(November of 1997: 2.15, April of 1998: 2.16, and December
of 1998: 2.20) ||. In fact, it is hard to achieve
this precision for a network of such a size. One may
estimate the value of the exponent using the highest
degrees (see Eq. (52) in Sec. IX D). Such estimates
confirm the reported values. For November of 1997, one
gets $\gamma \approx 1 + \ln 3015/\ln 590 \approx 2.26$, for April of 1998,
$\gamma \approx 2.24$, and for December of 1998, $\gamma \approx 2.22$. One
should note that, in paper ||, the dependence of a node
degree on its rank, $k(r)$, was also studied. A power law
(Zipf law) was observed, $k(r) \propto r^{-\zeta}$, but, as one can
check, the reported values of the $\zeta$ exponent are
inconsistent with the corresponding ones of $\gamma$.
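The cut-off estimate of the exponent, $\gamma \approx 1 + \ln N / \ln k_{max}$, is easy to evaluate for the inter-domain snapshots quoted above. The following short sketch uses only the vertex numbers and maximal degrees given in the text; the function name is hypothetical.

```python
import math


def gamma_from_cutoff(n_vertices, k_max):
    """Estimate the degree-distribution exponent from the network size
    and the maximal degree: gamma ~ 1 + ln N / ln k_max."""
    return 1.0 + math.log(n_vertices) / math.log(k_max)


# (number of vertices, maximal degree) for the inter-domain maps.
snapshots = {
    "November 1997": (3015, 590),
    "April 1998": (3530, 745),
    "December 1998": (4389, 979),
}
estimates = {date: gamma_from_cutoff(n, k) for date, (n, k) in snapshots.items()}
for date, gamma in estimates.items():
    print(date, round(gamma, 2))
```

All three estimates fall between about 2.2 and 2.3, close to the fitted exponents. Because the estimate depends only logarithmically on $N$ and $k_{max}$, it is a consistency check rather than a precise measurement.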

On the router level, according to relatively poor data
from 1995 p. 106], the Internet consisted of 3888 vertices
and 5012 edges, with the average degree equal to 2.57 and
the maximal degree equal to 39. The degree distribution
of this network was fitted by a power-law dependence
with the exponent $\gamma \approx 2.5$. Note that the estimate
from the maximal degree value gives a quite different
value, $\gamma \approx 1 + \ln 3888/\ln 39 = 3.3$, so that the empirical
value of the $\gamma$ exponent is not very reliable.

In 2000, the Internet already consisted of about
150 000 routers connected by 200 000 links [104]. The
degree distribution was found to "lend some support to
the conjecture that a power law governs the degree
distribution of real networks" [104]. If this is true, one can
estimate from this degree distribution that its $\gamma$ exponent
is about 2.3.

In Ref. ||, the distribution of the eigenvalues of the
adjacency matrix of the Internet graph was studied. The
ranking plots for large eigenvalues $\lambda(r)$ were obtained
(enumeration starts from the largest eigenvalue). For
all three studied inter-domain graphs, approximately,
$\lambda(r) \propto r^{-0.5}$. From this we get the form of the tail of
the eigenvalue spectra, $G(\lambda) \propto \lambda^{-(1+1/0.5)} = \lambda^{-3}$ (we
used the relation between the exponent of the distribution
and the ranking one that is discussed in Sec. IX D).
For the inter-router-95 graph, $\lambda(r) \propto r^{-0.2}$. Note that
these dependences were observed for only the 20 largest
eigenvalues.
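The step from the ranking plot to the tail of the spectrum can be written out explicitly. The following is a short sketch of the standard argument, with $\alpha$ introduced here as a generic ranking exponent (a symbol not used in the text): the rank of an eigenvalue is proportional to the number of eigenvalues exceeding it.

```latex
\begin{align}
  r(\lambda) &\propto \int_{\lambda}^{\infty} G(\lambda')\, d\lambda' , \\
  \lambda(r) \propto r^{-\alpha}
    \quad &\Longrightarrow \quad r(\lambda) \propto \lambda^{-1/\alpha} , \\
  G(\lambda) &\propto -\frac{d r(\lambda)}{d\lambda}
    \propto \lambda^{-(1 + 1/\alpha)} .
\end{align}
```

Setting $\alpha = 0.5$ gives $G(\lambda) \propto \lambda^{-3}$, in agreement with the value quoted for the inter-domain graphs.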

More recent data on the structure of the Internet
are collected by the National Laboratory for Applied
Network Research (NLANR). On its Web site
http://moat.nlanr.net/, one can find extensive Internet
routing related information collected since November
1997. For nearly each day of this period, NLANR
has a map of connections of operating "autonomous
systems" (AS), which approximately map to Internet Service
Providers. These maps (undirected networks) are
closely related to the Internet graph on the inter-domain
level.

For example, on 14.11.1997, 3042 AS numbers with 5595
interconnections were observed, and the average degree
was $\bar{k} = 3.68$; on 09.11.1998, these values were 4 301,
8 589, and 3.99, respectively; on 06.12.1999, they were 6 301,
13 485, and 4.28; but on 08.12.1999, there were only 768
AS numbers and 1 857 interconnections (!), so $\bar{k} = 4.84$.
Hence, fluctuations in time are very strong.

The statistical analysis of these data was made in Ref.
p6[ . The data were averaged, and for 1997 the following
average values were obtained. The mean degree of the
network was equal to 3.47, the clustering coefficient was
0.18, and the average shortest-path length was 3.77. For
1998, the corresponding values were 3.62, 0.21, and 3.76,
respectively. For 1999, they were 3.82, 0.24, and 3.72,
respectively. The average shortest-path lengths are close
to the lengths for corresponding classical random graphs,
but the clustering coefficients are very large. Notice that
the density of connections increases as the Internet grows.
One may say that the Internet shows accelerated growth. In
Ref. [107], the dependence of the total number of
interconnections (and the average degree) on the number of
AS was fitted by a power law. Unfortunately, the variation
ranges of these quantities are too small to reach any
reliable conclusion.

In Ref. |9(|, the following problem was considered.
New edges can connect pairs of new and old,
or old and old vertices. Where do they emerge, between
which particular vertices? The mean ratio of the number
of new links emerging between new and old vertices to
the number of new connections between already existing
vertices was 0.34, 0.48, and 0.53 in 1997, 1998, and 1999,
respectively. Thus the Internet structure is very distinct
from citation graphs, in which new edges emerge only
between new and old vertices.

The degree distributions for each of these three years
were found to follow a power-law form with the exponent
$\gamma \approx 2.2$, which is in agreement with Ref. ||. Furthermore,
in Ref. |36|, from the data of 1998, the dependence
of the average degree of the nearest neighbors of a vertex
on its degree, $\bar{k}_{nn}(k)$, was obtained. This slowly
decreasing function was approximately fitted by a power
law, $\bar{k}_{nn}(k) \propto k^{-0.5}$. Such a dependence indicates
strong correlations in the distribution of connections over
the network. Vertices of large degree usually have weakly
connected nearest neighbors, and vice versa.

Notice that the measurement of the average degree of
the nearest neighbors of a vertex vs. its degree is an
effective way to measure correlations between degrees of
separate vertices. As explained above, direct measurement
of the joint distribution $P(k_1, k_2)$ is difficult because of
inevitably poor statistics.
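The measurement of the average nearest-neighbor degree as a function of vertex degree can be sketched as follows. This is a minimal illustration for an undirected edge list; the function name and the input format are assumptions.

```python
from collections import defaultdict


def average_neighbor_degree(edges):
    """Return {k: mean degree of the nearest neighbors of vertices
    that have degree k} for an undirected graph given as an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                 # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, nbrs in adj.items():
        k = len(nbrs)
        # Mean degree of the neighbors of this particular vertex.
        sums[k] += sum(len(adj[w]) for w in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# A star graph: the hub of degree 4 has only degree-1 neighbors.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
knn = average_neighbor_degree(star)
```

A function that decreases with $k$, as in the star example, signals that strongly connected vertices tend to have weakly connected neighbors; a flat function corresponds to the absence of such degree-degree correlations.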

In principle, the behavior observed in Ref. J9(| is typical
for citation graphs growing under the mechanism of
preferential linking (see Sec. IX I). However, as indicated
above, most connections in the Internet emerge between
already existing sites. If the process of attachment
of these edges is preferential, strongly connected
sites usually have strongly connected nearest neighbors,
unlike what was observed in Ref. [|96j (see Sec. [XI ). A
difficulty is that vertices in the Internet are of at least two
distinct kinds. In Ref. [Q, the difference between "stub"
and "transit" domains of the Internet is noted. Stub
domains have no connections between them and connect
to transit domains, which are, in contrast, well
interconnected. Therefore, new connections or rewirings are
not possible between all vertices. This may be the reason for
the observed correlations. A different classification of
Internet sites was used in Ref. [108]. The vertices of the
Internet were separated into two groups, namely "users"
and "providers". Interaction between these two kinds of
sites leads to the self-organization of the growing network
into a scale-free structure.

The process of the attachment of new edges in these
maps of the Internet was empirically studied in Ref. [101].
It was found that the probability that a new edge is
attached to a vertex is a linear function of the vertex degree.

A very important feature of the Internet, both on the
AS (or the inter-domain) level and on the level of routers,
is that its vertices are physically attached to specific
places in the world and have fixed geographic coordinates.
The geographic positions of vertices and the distribution
of Euclidean distances are essential for the resulting
structure of the Internet. This factor was studied
and modeled in a recent paper [105]. It was observed that
routers and AS correlate with the population density. All
three sets (population, router, and AS space densities)
form fractal structures in space. The fractal dimensions
of these fractals were found to be approximately 1.5 (the
data for North America). Maps of AS and the map of
228 265 routers were analysed. In particular, the average
shortest distance between two routers was found to be
approximately 9 [105].



In Ref. [109], the structure of the Internet was considered
using an analogy with river networks. In such an
approach, a particular terminal is treated as the outlet
of a river basin. The paths from this terminal to all other
addresses form the structure of this basin. As for usual
river networks, the probability $P(n)$ that a (randomly
chosen) point connects $n$ other points uphill can be
introduced. In fact, $n$ is the size of the basin connected to
some point, and $P(n)$ is the distribution of basin sizes.
For river networks forming a fractal structure [110], this
distribution is of a power-law form, $P(n) \propto n^{-\tau}$, where
values of the $\tau$ exponent are slightly lower than 3/2 [111]. For
the Internet, it was found that $\tau = 1.9 \pm 0.1$ [109].
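The basin construction can be illustrated on a small graph. The sketch below is an assumption-laden toy: it takes a breadth-first shortest-path tree from the chosen terminal as a stand-in for the actual paths to all other addresses, and counts for each vertex the number of other vertices lying uphill of it in that tree. The function name and the adjacency-list input are hypothetical.

```python
from collections import deque


def basin_sizes(adj, root):
    """Return {vertex: number of other vertices uphill of it} in the
    breadth-first tree rooted at `root` (the outlet of the basin)."""
    parent = {root: None}
    order = []
    queue = deque([root])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    size = {v: 0 for v in order}
    # In reversed BFS order every vertex is handled before its parent.
    for v in reversed(order):
        p = parent[v]
        if p is not None:
            size[p] += size[v] + 1
    return size


# A small tree-like net: 0 - 1, with 1 branching to 2 - 3 and to 4.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
sizes = basin_sizes(adj, 0)
```

A histogram of the returned values over all vertices gives an empirical $P(n)$ for the chosen outlet; on real data one would use the measured paths rather than a breadth-first tree.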



2. Structure of the WWW 



Let us first discuss how the Web grows, that is, how
new pages appear in it (see Fig. 5). Here we describe
only two simple ways to add a new document.

(i) Suppose you want to create your own personal home
page. First you prepare it, put in references to some pages
of the Web (usually several references but, in principle,
the references may be absent), etc. But this is only the
first step. You have to make it accessible in the Web, to
launch it. You come to your system administrator, who
puts a reference to it (usually one reference) in the home
page of your institution, and that is more or less all:
your page is in the World Wide Web.

(ii) There is another way of having new documents
appear in the Web. Imagine that you already have your
personal home page and want to launch a new document.
The process is even simpler than the one described above.
You simply insert at least one reference to the document
into your page, and that is enough for the document to
be included in the World Wide Web. We should note also
that old documents can be updated, so new hyper-links
between them can appear. Thus, the WWW growth is
a much more complex process than the growth of citation
networks.




FIG. 5. Scheme of the growth of the WWW (compare 
with Fig. A new document (page) must have at least one 
incoming hyper-link to be accessible. Usually it has several 
references to existing documents of the Web but, in principle, 
these references may be absent. Old pages can be updated, 
so new hyper-links can appear between them. 



The structure of the WWW was studied experimen- 
tally in Refs. p|-^| j90|j9lt | and the power-law form of var- 
ious distributions was reported. These studies cover dif- 
ferent sub-graphs of the Web and even relate to its dif- 
ferent levels. The global structure of the entire Web was 
described in the recent paper . In this study, the crawl 
from Altavista is used. The most important results are 
the following. 

In May of 1999, from the point of view of Altavista, the
Web consisted of $203 \times 10^6$ vertices (URLs, i.e., pages)
and $1466 \times 10^6$ hyper-links. The average in- and
out-degree were $\bar{k}_i = \bar{k}_o = 7.22$. In October of 1999 there
were already $271 \times 10^6$ vertices and $2130 \times 10^6$ hyper-links.
The average in- and out-degree were $\bar{k}_i = \bar{k}_o = 7.85$.
This means that during this period, $68 \times 10^6$ pages and
$664 \times 10^6$ hyper-links were added, that is, 9.8 extra
hyper-links appeared per additional page. Therefore, the
number of hyper-links grows faster than the number of
vertices.
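The arithmetic behind these growth figures can be checked directly from the two crawl sizes quoted above. The snippet below is only a worked calculation; the variable names are illustrative.

```python
# Crawl sizes quoted in the text (pages, hyper-links).
may_1999 = (203_000_000, 1_466_000_000)
oct_1999 = (271_000_000, 2_130_000_000)

# In a directed graph the mean in-degree and the mean out-degree both
# equal the number of links divided by the number of vertices.
mean_degree_may = may_1999[1] / may_1999[0]
mean_degree_oct = oct_1999[1] / oct_1999[0]

# Links added per page added between the two crawls.
added_pages = oct_1999[0] - may_1999[0]
added_links = oct_1999[1] - may_1999[1]
links_per_new_page = added_links / added_pages

print(round(mean_degree_may, 2), round(mean_degree_oct, 2),
      round(links_per_new_page, 1))
```

The number of links added per new page exceeds the mean degree at both dates, which is what drives the mean degree upward.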

The in- and out-degree distributions are found to be
of a power-law form with the exponents $\gamma_i = 2.1$ and
$\gamma_o = 2.7$, which confirms earlier data of Albert et al. (|]
on the nd.edu subset of the WWW (325 000 pages).
These distributions were also fitted by the dependences
$(k + c_{i,o})^{-\gamma_{i,o}}$ with some constants $c_{i,o}$ | ]6l| . For the
in-degree distribution, the fitting provides $c_i = 1.25$ and
$\gamma_i = 2.10$, and for the out-degree distribution, $c_o = 6.94$
and $\gamma_o = 2.82$. Note that the fit is only for nonzero
in- and out-degrees $k_i$, $k_o$. The probabilities $P(k_i = 0)$ and
$P(k_o = 0)$ were not measured experimentally. The relation
between them can be found by employing Eq. (|J).

The relative sizes of giant components yield basic
information about the global topology of a directed network,
and, in particular, about the WWW. Let us assume
that a large directed graph has both the giant weakly
connected component (GWCC) and the giant strongly
connected component (GSCC) (see Sec. III D). Then its general
global structure can be represented in the following
form (see Fig. 6) iJjll.

FIG. 6. Structure of a directed graph when the giant
strongly connected component is present [112] (see the text).
Also, the structure of the WWW (compare with Fig. 9 of Ref.
||). If one ignores the directedness of edges, the network
consists of the giant weakly connected component (GWCC),
actually the usual percolating cluster, and disconnected
components (DC). Accounting for the directedness of edges,
the GWCC contains the following components: (a) the giant
strongly connected component (GSCC), that is, the set of
vertices reachable from its every vertex by a directed path;
(b) the giant out-component (GOUT), the set of vertices
approachable from the GSCC by a directed path (includes the
GSCC); (c) the giant in-component (GIN), which contains all
vertices from which the GSCC is approachable (includes the
GSCC); (d) the tendrils (TE), the rest of the GWCC, i.e.,
the vertices which have no access to the GSCC and are not
reachable from it. In particular, this part includes something
like "tendrils" |J but also there are "tubes" and numerous
clusters which are only "weakly" connected. Note that our
definitions of the GIN and GOUT differ from the definitions of
Refs. |,|l[: the GSCC is included in both GIN and GOUT,
so the GSCC is the intersection of the GIN and GOUT. We
shall show in Sec. XI B that this definition is natural.

At first, it is possible to extract the GWCC. The rest
of the network consists of disconnected clusters, the
"disconnected components" (DC). The GWCC consists of:

(a) the GSCC: from each vertex of the GSCC, there
exists a directed path to any other of its vertices;

(b) the giant out-component (GOUT): the vertices
which are reachable from the GSCC by a directed path,
so that GOUT includes GSCC;

(c) the giant in-component (GIN): the vertices from
which one can reach the GSCC by a directed path, so that
GIN includes GSCC;

(d) the tendrils (TE): the rest of the GWCC. This
part consists of the vertices which have no access to the
GSCC and are not reachable from it. In particular, it
indeed includes something like "tendrils", but there
are also "tubes" and numerous clusters which are only weakly
connected.

Notice that, in contrast to Refs. |]|51J], the above
defined GIN and GOUT include GSCC. In Sec. XI B we
shall show that this definition is natural.
One can write

Network = GWCC + DC

and

GWCC = GIN + GOUT - GSCC + TE .



According to Ref. @, in May of 1999, the entire Web,
containing $203 \times 10^6$ pages, consisted of

- the GWCC, $186 \times 10^6$ pages (91% of the total number
of pages), and

- the DC, $17 \times 10^6$ pages.

In turn, the GWCC included:

- the GSCC, $56 \times 10^6$ pages,

- the GIN, $99 \times 10^6$ pages,

- the GOUT, $99 \times 10^6$ pages, and

- the TE, $44 \times 10^6$ pages.
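The decomposition of a directed graph into these parts can be sketched with three reachability searches. The example below is a minimal illustration under stated assumptions: the graph is given as a list of directed edges, a vertex known to belong to the GSCC is supplied as `seed`, and GIN and GOUT are defined so that both include the GSCC, as in the text. The function names are hypothetical.

```python
from collections import deque


def reachable(adj, sources):
    """Set of vertices reachable from `sources` along the edges in `adj`."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def bow_tie(edges, seed):
    """Split a directed graph into GSCC, GIN, GOUT, TE and DC,
    assuming `seed` is a vertex of the GSCC."""
    out_adj, in_adj, und_adj = {}, {}, {}
    for u, v in edges:
        out_adj.setdefault(u, []).append(v)
        in_adj.setdefault(v, []).append(u)
        und_adj.setdefault(u, []).append(v)
        und_adj.setdefault(v, []).append(u)
    gout = reachable(out_adj, [seed])    # reachable from the GSCC
    gin = reachable(in_adj, [seed])      # can reach the GSCC
    gscc = gin & gout
    gwcc = reachable(und_adj, [seed])    # directions ignored
    tendrils = gwcc - (gin | gout)
    dc = set(und_adj) - gwcc
    return {"GSCC": gscc, "GIN": gin, "GOUT": gout, "TE": tendrils,
            "GWCC": gwcc, "DC": dc}


# Toy graph: a directed triangle 1-2-3 as the core, vertex 0 feeding it,
# vertex 4 fed by it, vertex 5 hanging off 0, and a separate pair 6 -> 7.
edges = [(1, 2), (2, 3), (3, 1), (0, 1), (3, 4), (0, 5), (6, 7)]
parts = bow_tie(edges, seed=1)
```

With these definitions the sizes satisfy GWCC = GIN + GOUT - GSCC + TE, which for the toy graph reads 6 = 4 + 4 - 3 + 1, and for the Web data quoted above reads 186 = 99 + 99 - 56 + 44 (in millions of pages).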

Both distributions of the sizes of strongly connected 
components and of the sizes of weakly connected ones 
were fitted by power-law dependences with exponents ap- 
proximately 2.5. 

The probability that a directed path is present between
two random vertices was estimated as 24%. For pairs of
pages of the WWW between which directed paths exist,
the average shortest-directed-path length equals 16. For
pairs between which at least one undirected path exists,
the average shortest-undirected-path length equals 7.

The value of the average shortest-directed-path length
estimated from data extracted from the nd.edu subset of
the WWW was 19 [§. This first published value for the
"diameter" of the Web was obtained in a non-trivial way
(it is not so easy to find the shortest path in large
networks). (i) The in-degree and out-degree distributions
were measured in the nd.edu domain. (ii) A set of small
model networks of different sizes $N$ with these in-degree
and out-degree distributions was constructed.
(iii) For each of these networks, the average shortest-path
length $\bar{\ell}$ was found. Its size dependence was estimated
as $\bar{\ell}(N) \approx 0.35 + 2.06 \lg N$. (iv) $\bar{\ell}(N)$ was extrapolated
to $N = 800\,000\,000$, that is, the estimate of the size of
the WWW in 1999. The result, i.e., $\bar{\ell}(800\,000\,000) \approx 19$,
is very close to the above cited value $\bar{\ell}(200\,000\,000) = 16$
of Ref. H if one accounts for the difference of sizes.
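The final extrapolation step is a one-line calculation. The sketch below assumes that "lg" denotes the decimal logarithm; the function name is hypothetical.

```python
import math


def mean_path_length(n_pages):
    """Fitted size dependence of the average shortest-path length,
    0.35 + 2.06 * log10(N), taken from the text."""
    return 0.35 + 2.06 * math.log10(n_pages)


estimate_1999 = mean_path_length(800_000_000)   # extrapolated Web size
estimate_crawl = mean_path_length(200_000_000)  # size of the Altavista crawl
print(round(estimate_1999, 1), round(estimate_crawl, 1))
```

Because the dependence is logarithmic, a fourfold change in the number of pages shifts the estimate by only about 1.2, which is why the extrapolated value and the directly measured one are comparable.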

The maximal shortest path between nodes belonging 
to the GSCC equals 28. The maximal shortest directed 
path for nodes of the WWW between which a directed 
path exists is greater than 500 (some estimates indicate 
that it may be even 1000). 

Although the GSCC of the WWW is rather small, most
pages of the WWW belong to the GWCC. Furthermore,
even if all links to pages with in-degree larger than 2 are
removed, the GWCC does not disappear. This is clearly
demonstrated by the data of Ref. || :

The size of the GWCC of the Web (visible by Altavista
in May 1999) is $186 \times 10^6$ pages.
If all in-links to pages with $k_i > k_i^{(max)} =
1000, 100, 10, 5, 4$, and 3 are removed, the size of the
remaining GWCC is $177 \times 10^6$, $167 \times 10^6$, $105 \times 10^6$,
$59 \times 10^6$, $41 \times 10^6$, and $15 \times 10^6$ pages, respectively.

The Web grows much faster than the capabilities of
hardware. Even the best search engines index less than
one half of all pages of the Web gill,!!]. Updating the
files cached by them for quick search usually takes many
months. The only way to improve the situation is the
indexing of special areas of the WWW, "cyber-communities",
to make an efficient specialized search possible
0,HJ|3|8|,^,|l|,|ll|-|ll|l .

Natural objects for such indexing are specific bipartite
sub-graphs (see Fig. 7) j9(j|nj . One should note that
the directed graphs of this kind have a different structure
than the bipartite graphs described in Sec. V B.
After separation from the other part of a network, they
consist of only two kinds of nodes: "hubs" (fans) and
"authorities" (idols). Each hub connects to all the
authorities of this graph. Let there be $h$ hubs and $a$ authorities
in the bipartite graph. Each hub, by definition, must
have $a$ links, one directed to each of the $a$ authorities. Hence,
the number of links between the subsets of hubs and
authorities equals $ha$. Some extra connections may exist
inside these two subsets.

FIG. 7. A bipartite directed sub-graph in the Web being
used for indexing cyber-communities |90 bll] . (Figure labels:
authorities, hubs.)

The distribution of the number of such bipartite
sub-graphs in the Web, $N_b(h, a)$, was studied in Refs. p(i[jj| .
For a fixed number of hubs, $N_b(h = \mathrm{fixed}, a)$ resembles a
power-law dependence, and for a fixed number of authorities,
$N_b(h, a = \mathrm{fixed})$ resembles an exponential one when
$h$ is small. We should note that these data are poor.

One can also consider the structure of the Web on another
level. In particular, in Ref. |l!7| , the in-degree
distribution for the domain level of the Web in spring of 1997
was studied, where each vertex (Web site) is a separate
domain name, and the value 1.94 for the corresponding
exponent was reported. The network consisted of 259 794
vertices.

Measurements of the clustering coefficient of the Web
on this level [118] have shown that it is much larger
than it should be for the corresponding classical random
graph. The data were extracted from the same crawl
containing 259 794 sites.

Several other empirical distributions were obtained,
which do not relate directly to the global structure of
the Web but indicate some of its properties. Huberman
and Adamic || found that the distribution of the number
of pages in a Web site also demonstrates a power-law
dependence (a Web site is a set of linked pages on a Web
server). From their analysis of sets of 259 794 and 525 882
Web sites covered by Alexa and Infoseek it follows that
the exponent in this power law is about 1.8. Note that
the power-law dependence seems not very pronounced in
this case. A power-law dependence was also reported for the
distribution of the number of visits (connections) to
Web sites [119]. The value of the corresponding exponent
was estimated as 2.0. The fit is rather poor.

One should stress that what experimentalists usually
report as a power-law dependence is actually a linear
fit over a rather narrow range on a log-log plot. It
is nearly impossible to obtain a functional form for
the degree distribution directly because of strong
fluctuations. To avoid them, the cumulative distribution
$P_{cum}(k) = \int_k^{\infty} dk\, P(k)$ is usually used [p8l. Nevertheless,
the restricted sizes of the studied networks often
lead to implausible interpretations (see the discussion of
the finite size effects in Secs. IX C and IX D). One has to
keep this in mind while working with such experimental
data.


D. Biological networks 

1. Structure of neural networks 

Let us consider the rich structure of the neural network
of a tiny organism, the classical C. elegans. Its 282 neurons
form a network of directed links with average degree
$\bar{k} = 14$ pyr^ . The in- and out-degree distributions
are exponential. The average shortest-path length
measured without account of the directedness of edges is 2.65,
and the clustering coefficient equals 0.26. Therefore, the
network displays the small-world effect, and the clustering
coefficient is much larger than the characteristic
value for the corresponding classical random graph,
$C = 0.26 \gg 14/282 \approx 0.05$.



2. Networks of metabolic reactions 

A valuable example of a biological network with an
extremely rich topological structure is provided by the
network of metabolic reactions (3^JU,^7|. This is a
particular case of chemical reaction graphs [37,120,121]. At
present, such networks are documented for several
organisms. Their vertices are substrates (molecular compounds),
and the edges are metabolic reactions connecting
substrates. According to jyj (see also [122]), incoming
links for a particular substrate are reactions in which it
participates as a product. Outgoing links are reactions
in which it is an educt.

Sizes of such networks in the 43 organisms investigated in
PI are between 200 and 800. The average shortest-path
length is about 3, and $\bar{k}_i \approx \bar{k}_o \approx 2.5$ to $4.0$. Although
the networks are very small, the in- and out-degree distributions
were interpreted J4l| as scale-free, i.e., of a power-law
form with the exponents $\gamma_i \approx \gamma_o \approx 2.2$.

In Ref. [123], one may find another study of the global
structure of metabolic reaction networks. The networks
were treated as undirected. For the network of Escherichia
coli, consisting of 282 nodes, the average degree
is $\bar{k} \approx 7$. The average shortest-path length was found
to be equal to 2.9. The clustering coefficient is $C \approx 0.3$,
that is, much larger than for the corresponding classical
random network, $7/282 \approx 0.025$.

The distribution of short cycles in large metabolic
networks is considered in Ref. [124].



3. Protein networks 

A genomic regulatory system can be thought of as an 
extremely large directed network |p7|| . Vertices in this 
network are distinct components of the genomic regu- 
latory system, and each directed edge points from the 
regulating to the regulated component. 

A very important aspect of gene function is protein- 
protein interactions - "the number and identity of pro- 
teins with which the products of duplicate genes in an 
organism interact" (see Ref. (4j| for a brief introduction 
to the topic). The vertices of the protein-protein interaction
network are proteins and the directed edges are, usually,
pairwise protein-protein interactions. Two vertices may be
connected by a pair of opposing edges, and the network also
contains unit loops, so that its general structure resembles
the structure of a food web (see Fig. 8). Recently, large maps
of protein-protein interaction networks were obtained, which
may be used for structural analysis.

In Ref. HJ] (for details see Ref. the distribution 

of connections in the protein-protein interaction network 
of the yeast, S. cerevisiae was studied using the map from 
Ref. @ (see also Ref. g|). The network contains 1870 
vertices and 2240 edges. The degree distribution was
interpreted as a power-law (scale-free) dependence with an
exponential cut-off at the point $k_c \approx 20$. This value is
so small that it is difficult to find the exponent of the
degree distribution. The approximate value $\gamma \approx 2.5$ was
obtained in Ref. @ . 

In addition, in Ref. [44], the tolerance of this network
against random errors (random deletion of proteins) and
its fragility against the removal of the most connected
vertices were studied. Random errors were found to be
rather harmless, but the deletion of a single highly
connected protein (one having more than 15 links) was
lethal with high probability.



4. Ecological and food webs

Food webs of species-rich ecosystems are directed
networks, where vertices are distinct species, and directed
edges connect pairs: a species-eater and its food [125].
In Refs. [48,49], structures of three food webs were studied
ignoring the directedness of their edges.

The networks considered in Refs. [48,49] are very small.

(i) The food web of Ythan estuary consists of N = 93
vertices. The average degree is $\bar{k} = 8.70$, the average
clustering coefficient is equal to C = 0.22, the average
shortest-path length is $\bar{\ell} = 2.43$.

(ii) Silwood park web (more precisely speaking, this is
a sub-web): N = 154, $\bar{k} = 4.75$, C = 0.15, $\bar{\ell} = 3.40$.

(iii) The food web of Little Rock lake: N = 182,
$\bar{k} = 26.05$, C = 0.35, $\bar{\ell} = 2.22$.

The clustering coefficients obtained for these networks 
essentially exceed the corresponding values for the classi- 
cal random graphs with the same total number of vertices 
and edges. However, the measured average shortest-path 
lengths of these webs do not deviate noticeably from the 
corresponding values for the classical random graphs. 
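
The comparison with classical random graphs can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes the standard estimates for a classical random graph, $C \approx \bar{k}/N$ and $\bar{\ell} \approx \ln N / \ln \bar{k}$, and uses the three sets of values quoted above; the dictionary and function names are hypothetical.

```python
import math

# (N, average degree, measured C, measured average shortest-path length),
# taken from the three food webs listed in the text.
webs = {
    "Ythan estuary": (93, 8.70, 0.22, 2.43),
    "Silwood park": (154, 4.75, 0.15, 3.40),
    "Little Rock lake": (182, 26.05, 0.35, 2.22),
}

def random_graph_baseline(n, mean_degree):
    """Estimates for a classical random graph with the same N and average degree.

    Assumed estimates: C ~ k/N and l ~ ln N / ln k.
    """
    c_rand = mean_degree / n
    l_rand = math.log(n) / math.log(mean_degree)
    return c_rand, l_rand

baselines = {name: random_graph_baseline(n, k) for name, (n, k, _, _) in webs.items()}

for name, (n, k, c, l) in webs.items():
    c_rand, l_rand = baselines[name]
    print(f"{name:17s} C = {c:.2f} vs {c_rand:.3f}   l = {l:.2f} vs {l_rand:.2f}")
```

In each case the measured clustering coefficient is several times the random-graph estimate, while the two shortest-path lengths differ by less than one step.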

Furthermore, the degree distributions of the first two
webs were fitted by power laws with the exponents
$\gamma \approx 1.0$ and $\gamma \approx 1.1$ for the Ythan estuary web
and for the Silwood park web, respectively. This allowed the
authors of Refs. [48,49] to consider them as scale-free networks
(however, see Refs. [50,51], where the degree distributions
in such food webs were interpreted as being of an
exponential-like form). These are the smallest networks for
which a power-law distribution was ever reported. For the third
food web, any functional fitting turned out to be impossible.

Additionally, in Ref. [49], the stability of food webs
against random or intentional removal of vertices was
considered. The results were typical for scale-free
networks (see Sec. XI C).




FIG. 8. Typical food web. Cannibalism and mutual eat- 
ing are widespread. 

Food webs have a rather specific structure. They are
directed, include unit loops (that is, cannibalism), and two
opposing edges may connect a pair of vertices (mutual
eating) [47,52] (see Fig. 8; compare with the structure
of a protein-protein interaction network). Therefore, the
maximal possible number of edges (trophic links) in a
food web containing N vertices (trophic species) is equal
to $2N(N-1)/2 + N = N^2$. Food webs are actually dense:
the total number L of edges is high. The values of the
ratio $L/N^2$ for seven typical food webs with N = 25-92
were found to be in the range between 0.061 and 0.32
[47]. The authors of Ref. [52] observed that this leads to an
extreme smallness of food webs. Edges were treated as
undirected, and the average shortest-path lengths were
then measured to be in the range between 1.44 and 2.55.

We should emphasize that it is hard to find well de- 
fined and large food webs. This seriously hinders their 
statistical analysis. 



5. Word Web of human language 



Ferrer and Sole (2001) [126] constructed a net of funda- 
mental importance, namely the network of distinct words 
of human language. Here we call it Word Web. The Word 
Web is constructed in the following way. The vertices of 
the web are the distinct words of language, and the undi- 
rected edges are connections between interacting words. 
It is not so easy to define the notion of word interac- 
tion in a unique way. Nevertheless, different reasonable 
definitions provide very similar structures of the Word 
Web. For instance, one can connect the nearest neigh- 
bors in sentences. Without going into details, this means 
that the edge between two distinct words of language ex- 
ists if these words are the nearest neighbors in at least 
one sentence in the bank of language. In such a definition,
multiple links are absent. One may also connect the
second nearest neighbors and account for other types of
correlations between words [126]. In fact, the Word Web
displays the co-occurrence of the words in sentences of a
language.



Two slightly different methods were used in Ref. [126]
to construct the Word Web. The two resulting webs, obtained
after processing 3/4 million words of the British
National Corpus (a collection of text samples of both spoken
and written modern British English), have nearly the
same degree distributions (see Fig. 9), and each contains
about 470 000 vertices. The average number of connections
of a word (the average degree) is $\bar{k} \approx 72$. As one sees
from Fig. 9, the degree distribution comprises two distinct
regions with quite different power-law dependences.
The range of the degree variation is really large, so the
result looks convincing. The exponent of the power law
in the low-degree region is approximately 1.5, and in the
high-degree region it is close to 3 (the value 2.7 was
reported in Ref. [126]).
FIG. 9. The distribution of the numbers of connections
(degrees) of words in the Word Web in a log-log scale [126].
Empty and filled circles show the distributions of the number
of connections obtained in Ref. [126] for two different methods
of the construction of the Word Web. The solid line is
the result of the theory of Ref. [127] (see Sec. IX J), where the
parameters of the Word Web, namely, the size $t \approx 470\,000$ and
the average number of connections of a node, $\bar{k}(t) \approx 72$, were
used. The arrows indicate the theoretically obtained point
of crossover, $k_{cross}$, between the regions with different power
laws, and the cutoff $k_{cut}$ due to the size effect. For a better
comparison, the theoretical curve is displaced upward to exclude
two experimental points with the smallest k (note that
the comparison is impossible in the region of the smallest k,
where the empirical distribution essentially depends on the
definition of the Word Web).

The complex empirical degree distribution of the Word
Web was described without fitting using a simple model
of the evolution of human language [127] (see Fig. 9 and
Sec. IX J).



E. Electronic circuits 



In Ref. [128], the structure of large electronic circuits
was analysed. Electronic circuits were viewed as undirected
random graphs. Their vertices are electronic components
(resistors, diodes, capacitors, etc. in analog circuits
and logic gates in digital circuits) and the undirected
edges are wires. The networks considered in Ref. [128]
have sizes N in the range between 20 and $2 \times 10^4$
and the average degree between 3 and 5.

For these circuits, the clustering coefficients, the av- 
erage shortest-path lengths, and the degree distributions 
were obtained. In all the networks, the values of the aver- 
age shortest-path length were close to those for the corre- 
sponding classical random graphs with the same numbers 
of vertices and links. There was a wide diversity of values
of the clustering coefficients. However, all the large
circuits considered in Ref. [128] ($N > 10^4$) have clustering
coefficients that exceed those for the corresponding
classical random graphs by more than one order of magnitude.

The most interesting results were obtained for the degree
distributions, which were found to have power-law
tails. The degree distributions of the two largest digital
circuits were fitted by power laws with the exponent
$\gamma \approx 3.0$. Note that the maximal value of the number
of connections of a component in these large circuits
approaches $10^2$.



F. Other networks 



We have listed above only the most representative and
well documented networks. Many kinds of friendship networks
may be added |l7|llil]2g|| . Polymers also form complex
networks [129-131]. Even human sexual contacts
were found to form a complex network. It was recently
discovered [132] that this marvelous web is scale-free,
unlike friendship networks p3] which are exponential.

One can introduce a call graph generated by long distance
telephone calls taken over some time interval [ p2[ .
Vertices of this network are telephone numbers, and the
directed links are completed phone calls (the direction
is determined by the initiator of the call). In Ref. [p2[ ,
calls made in a typical day were collected, and the network
consisting of $47 \times 10^6$ nodes was constructed (note,
however, that this network was probably generated and
not obtained from empirical data). It was impossible to
fit the out-degree distribution $P(k_o)$ by any power-law
dependence, but the fitting of the in-degree distribution
$P(k_i)$ gave $\gamma_i \approx 2.1$. The size of the giant connected
component is of the order of the network size, and all other
connected components are of the order of the logarithm of
this size or smaller. The distribution of the sizes of
connected components was measured, but it was hard to draw
any conclusion about its functional form.

Basic data for all networks, in which power-law degree 
distributions were observed, are summarized in Table || 
and Fig. ^4| For each such network, the total numbers 
of vertices and edges, and the degree distribution expo- 
nent are presented (see discussion of scale-free networks 
in Sec. IX).

We finish our incomplete list with a power grid of the 
Western States Power Grid jll|,|l^,|28| (its vertices are 
transformers, substations, and generators, and edges are 
high- voltage transmission lines). The number of vertices 
in this undirected graph is 4 941, and the average de- 
gree k is 2.67. The average shortest-path length equals 
18.7. The clustering coefficient of the power grid is much
greater than for the corresponding classical random network,
$C = 0.08 \gg 2.67/4941 \approx 0.0005$ 00)-. The
degree distribution of the network is exponential En] .



VI. CLASSICAL RANDOM GRAPHS, THE
ERDŐS-RÉNYI MODEL



The simplest and most studied network with undirected
edges was introduced by Erdős and Rényi (ER model).
In this network:

(i) the total number of vertices, N, is fixed;

(ii) the probability that two arbitrary vertices are
connected equals p.

One sees that, on average, the network contains
$pN(N-1)/2$ edges. The degree distribution is binomial,

$$P(k) = \binom{N-1}{k}\, p^k (1-p)^{N-1-k}, \qquad (4)$$

so the average degree is $\bar{k} = p(N-1)$. For large N, the
distribution, Eq. (4), takes the Poisson form,

$$P(k) = e^{-\bar{k}}\, \frac{\bar{k}^k}{k!}. \qquad (5)$$



Therefore, the distribution rapidly decreases at large de- 
grees. Such distributions are characteristic for classical 
random networks. Moreover, in the mathematical liter- 
ature, the term "random graph" usually means just the 
network with a Poisson degree distribution and statis- 
tically uncorrelated vertices. Here, we prefer to call it 
"classical random graph" . 

We have already presented the estimate for an average
shortest-path length of this network, $\bar{\ell} \sim \ln N / \ln(pN)$.

At small values of p, the system consists of small clusters.
At large N and large enough p, the giant connected
component appears in the network. The percolation
threshold is $p_c \cong 1/N$, that is, $\bar{k}_c = 1$.

In fact, the ER model describes percolation on a lat- 
tice of infinite dimension, and the adequate mean-field 
description is possible. 
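
The properties listed in this section are easy to check numerically. The following is a minimal sketch, not taken from the original text: it builds a classical random graph directly from the definition (each pair linked with probability p), measures the average degree, and compares the relative size of the largest connected component below and above the threshold $\bar{k}_c = 1$. The function names, the network size N = 1000, and the two sampled average degrees are illustrative assumptions.

```python
import random
from collections import deque

def er_graph(n, p, rng):
    """Adjacency lists of a classical random graph: each pair is linked with probability p."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def largest_component(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = [False] * len(adj)
    best = 0
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if not seen[w]:
                    seen[w] = True
                    queue.append(w)
        best = max(best, size)
    return best

def summary(n, mean_degree, seed=0):
    """Return (measured average degree, relative size of the largest component)."""
    rng = random.Random(seed)
    adj = er_graph(n, mean_degree / (n - 1), rng)  # p chosen so that p(N-1) = mean_degree
    measured = sum(len(a) for a in adj) / n
    return measured, largest_component(adj) / n

below = summary(1000, 0.5)   # below the percolation threshold
above = summary(1000, 3.0)   # above the percolation threshold
print("k = %.2f, largest component fraction = %.3f" % below)
print("k = %.2f, largest component fraction = %.3f" % above)
```

Below the threshold the largest cluster contains only a few percent of the vertices, whereas above it a single component spans most of the network.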



VII. SMALL- WORLD NETWORKS 



In Sec. III B, we explained that random networks usually
show the so-called small-world effect, i.e., their average
shortest-path length is small. Then, in principle,
it is natural to call them small-world networks. Watts
and Strogatz [11] noticed the following important feature
of numerous networks in Nature. Although the average
shortest-path length between their vertices is really
small and is of the order of the logarithm of their size,
the clustering coefficient is much greater than it should
be for classical random graphs. They proposed a model
(the WS model) that demonstrates such a possibility and
also called it the small-world network. The model belongs
to the class of networks displaying a crossover from
ordered to random structures and may be treated analytically.
By definition of Watts and Strogatz, the small-world
networks are those with "small" average shortest-path
lengths and "large" clustering coefficients.
This definition seems a bit controversial. (i) According
to it, numerous random networks with a small clustering
coefficient are not small-world networks although they
display the small-world effect. (ii) If one starts from a 1D
lattice with interaction only between the nearest neighbors,
or from simple square or cubic lattices, the initial
clustering coefficient is zero and it stays small during the
procedure proposed by Watts and Strogatz, although the
network evidently belongs to the same class of nets as the
WS model. In addition, as we will show, the class of networks
proposed by Watts and Strogatz provides only a
particular possibility to get such a combination of the
average shortest-path length and the clustering coefficient
(see Sec. VII C).



Irrespective of the consistency of the definition of the
small-world networks [11,12] and its relation with real
networks, the proposed type of networks is very interesting.
In fact, the networks introduced by Watts and
Strogatz have an important generic feature: they are
constructed from ordered lattices by random rewiring of
edges or by addition of connections between random vertices.
In the present section, we consider mainly networks
of such kind.




FIG. 10. Small-world networks in which the crossover
from a regular lattice to a random network is realized. (a)
The original Watts-Strogatz model with the rewiring of links
[11]. (b) The network with the addition of shortcuts [ 13E
A. The Watts-Strogatz model and its variations



Watts and Strogatz studied the crossover between
these two limits (a regular lattice and a classical random
graph). The main interest was in the average shortest path,
$\bar{\ell}$, and the clustering coefficient (recall that
each edge has unit length). The simple but exciting result
was the following. Even for a small probability
of rewiring, when the local properties of the network are
still nearly the same as for the original regular lattice and
the clustering coefficient does not differ essentially from
its initial value, the average shortest-path length is already
of the order of the one for classical random graphs
(see Fig. 11).



The original network of Watts and Strogatz is constructed
in the following way (see Fig. 10(a)). Initially,
a regular one-dimensional lattice with periodic boundary
conditions is present. Each of L vertices has $z \geq 4$
nearest neighbors (z = 2 was not appropriate for Watts
and Strogatz since, in this case, the clustering coefficient
of the original regular lattice is zero). Then one takes
all the edges of the lattice in turn and, with probability
p, rewires each of them to a randomly chosen vertex. In
such a way, a number of far connections appears. Obviously,
when p is small, the situation has to be close to the
original regular lattice. For large enough p, the network is
similar to the classical random graph. Note that the periodic
boundary conditions are not essential.
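
The construction just described can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the original procedure of Ref. [11] in every detail: only one end of each edge is moved, self-loops and double edges are rejected, the clustering coefficient is computed as the average over vertices of the local clustering, and path lengths are averaged over reachable pairs. All function names and the values L = 400, z = 10, p = 0.01 are hypothetical choices.

```python
import random
from collections import deque

def ring_lattice(n, z):
    """Ring of n vertices, each linked to its z nearest neighbours (z even)."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for step in range(1, z // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, z, p, rng):
    """Take every lattice edge in turn; with probability p move its far end
    to a uniformly chosen vertex (no self-loops, no double edges)."""
    n = len(adj)
    for i in range(n):
        for step in range(1, z // 2 + 1):
            if rng.random() >= p:
                continue
            j = (i + step) % n
            new = rng.randrange(n)
            while new == i or new in adj[i]:
                new = rng.randrange(n)
            adj[i].discard(j)
            adj[j].discard(i)
            adj[i].add(new)
            adj[new].add(i)
    return adj

def clustering(adj):
    """Average over vertices of the fraction of linked pairs of neighbours."""
    total = 0.0
    for nbrs in adj:
        k = len(nbrs)
        if k < 2:
            continue
        nb = list(nbrs)
        links = sum(1 for a in range(k) for b in range(a + 1, k) if nb[b] in adj[nb[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def mean_path_length(adj):
    """Average shortest-path length over all reachable ordered pairs (BFS from each vertex)."""
    n = len(adj)
    total = pairs = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(d for d in dist if d > 0)
        pairs += sum(1 for d in dist if d > 0)
    return total / pairs

def measure(n, z, p, seed=1):
    adj = rewire(ring_lattice(n, z), z, p, random.Random(seed))
    return mean_path_length(adj), clustering(adj)

l0, c0 = measure(400, 10, 0.0)    # regular lattice
l1, c1 = measure(400, 10, 0.01)   # a small fraction of rewired edges
print(f"p = 0   : l = {l0:.2f}, C = {c0:.3f}")
print(f"p = 0.01: l/l(0) = {l1 / l0:.2f}, C/C(0) = {c1 / c0:.2f}")
```

For the regular lattice with z = 10 the clustering coefficient is 2/3. A handful of rewired edges is already enough to shorten the average path noticeably while leaving C almost unchanged.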
FIG. 11. Average shortest-path length $\bar{\ell}$ and clustering
coefficient C of the Watts-Strogatz model vs. fraction of the
rewired links p [11]. Both are normalized to their values for
the original regular lattice (p = 0). The network has 1000
nodes. The average number of the nearest neighbors equals
10. C is practically constant in the range where $\bar{\ell}$ sharply
diminishes.

This result seems quite natural. Indeed, the average
shortest-path length is very sensitive to the shortcuts.
One can see that it is enough to make a few random
rewirings to decrease $\bar{\ell}$ by several times. On the other
hand, several rewired edges cannot crucially change the
local properties of the entire network. This means that
the global properties of the network change strongly already
at $pzL \sim 1$, when there is one shortcut in the network,
i.e., at $p \sim 1/(Lz)$, when the local characteristics
are still close to the regular lattice.

Recall that the simplest local characteristic of nets is
degree. Hence, it would be natural to compare, at first,
the behavior of $\bar{\ell}$ and $\bar{k}$. However, in the originally
formulated WS model, $\bar{k}$ is independent of p since the total
number of edges is conserved during the rewiring. Watts
and Strogatz took another characteristic for comparison:
the characteristic of the closest environment of a vertex,
i.e., the clustering coefficient C.

Using the rewiring procedure, a network with a small
average shortest-path length and a large clustering coefficient
was constructed. Instead of the rewiring of edges,
one can add shortcuts to a regular lattice (see Fig. 10(b))
[93,133-135]. The main features of the model do not
change. One can also start with a regular lattice of an
arbitrary dimension d, where the number of vertices is
$N = L^d$ [136,137]. In this case, the number of edges in the
regular lattice is $zL^d/2$. To keep the correspondence to the
WS model, let us define p in such a way that for p = 1,
$zL^d/2$ random shortcuts are added. Then, the average number
of shortcuts in the network is $N_s = pzL^d/2$. At small
$N_s$, we have two natural lengths in the system, $\bar{\ell}$ and L,
since the lattice spacing is not important in this regime.
Their dimensionless ratio can be only a function of $N_s$,

$$\frac{\bar{\ell}}{L} = f(2N_s) = f(pzL^d), \qquad (6)$$

where $f(0) \sim 1$ for the original regular lattice and
$f(x \gg 1) \sim \ln x / x^{1/d}$. From Eq. (6), one can immediately
obtain the following relation, $\bar{\ell}\,(pz)^{1/d} = g(L(pz)^{1/d})$. Here,
$\xi = (pz)^{-1/d}$ has the meaning of a length: $N_s \xi^d \sim L^d$,
it is the average distance between the closest end points
of shortcuts measured on the regular lattice. In fact, one
must study the limit $L \to \infty$, $p \to 0$, as the number of
shortcuts $N_s = pzL^d/2$ is fixed. The last relation for $\bar{\ell}$, in
the case d = 1, was proposed and studied by simulation
in Ref. [138] and afterwards analytically [139,140].

The WS model and its variations seem exactly solvable.
Nevertheless, the only known exact result for the
WS model is its degree distribution. It was found to be a
rapidly decreasing function of a Poisson kind [140]. The
exact form of the shortest-path length distribution has
been found only for the simplest model in this class [141],
see Sec. VII B.




FIG. 12. Scaling of the average shortest-path length of
"small-world" networks [134]. The combination $\bar{\ell} z/L$ vs.
$pzL$ (the horizontal axis is $\log_{10}(pzL)$) for the network
constructed by the addition of random shortcuts to a
one-dimensional lattice of the size L with the coordination
number z.



Many efforts were directed to the calculation of the
scaling function f(x) describing the crossover between
the two limiting regimes |l3lH^5|Jl^Jl40|JT4^ |l46| . As we
have already explained, the average shortest-path length
rapidly decreases to values characteristic for classical
random networks as p grows. Therefore, it is convenient to
plot f(x) in log-linear scales (see Fig. 12).



One may study the spreading of diseases on such
networks [147]. In Fig. 13, the portion of "infected" nodes,
$n_i/L$, in the network is shown vs. time passed after some
vertex was infected [135]. At each time step, all the nearest
neighbors of each infected vertex fall ill. At short
times, $n_i/L \propto t^d$ but then, at longer times, it increases
exponentially until the saturation at the level $n_i/L = 1$.
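
This spreading rule amounts to a breadth-first sweep of the network, which the following sketch makes explicit. It is only an illustration with assumed parameters: a ring with z = 2 (so d = 1), a fixed number of randomly placed shortcuts instead of a probability p, and vertex 0 as the first infected vertex. The function name `infected_fraction` is hypothetical.

```python
import random

def infected_fraction(L, n_shortcuts, seed=0):
    """Fraction of infected vertices after t = 0, 1, 2, ... steps.

    The network is a ring of L vertices (z = 2) plus n_shortcuts random
    shortcuts. At every step all neighbours of infected vertices fall ill.
    """
    rng = random.Random(seed)
    adj = [[(i - 1) % L, (i + 1) % L] for i in range(L)]
    for _ in range(n_shortcuts):
        a, b = rng.randrange(L), rng.randrange(L)
        if a != b:
            adj[a].append(b)
            adj[b].append(a)

    infected = [False] * L
    infected[0] = True          # the first vertex that "fell ill"
    front = [0]
    count = 1
    history = [count / L]
    while front:
        new_front = []
        for v in front:
            for w in adj[v]:
                if not infected[w]:
                    infected[w] = True
                    new_front.append(w)
        front = new_front
        if front:
            count += len(front)
            history.append(count / L)
    return history

ring_only = infected_fraction(2000, 0)
with_shortcuts = infected_fraction(2000, 20)
print("steps to saturation without shortcuts:", len(ring_only) - 1)
print("steps to saturation with 20 shortcuts:", len(with_shortcuts) - 1)
```

Without shortcuts the infected fraction grows linearly, $(1 + 2t)/L$, which is the $t^d$ law for d = 1. Each shortcut seeds a new front, so the growth accelerates and saturation is reached much earlier.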
FIG. 13. Spreading of diseases in "small-world" networks
[135]. The average fraction of infected nodes, $n_i/L$, vs. the
elapsed time from the instant when the first vertex "fell ill".

It is possible to consider various problems for these
networks [pd?j| , p8Hl62| . In Refs. |t^,|63|, percolation
in them was studied (for infinitely large networks). Diffusion
in the WS model and other related nets was considered
in [164].

It is easy to generalize the procedure of rewiring or
addition of edges. In Refs. [136,137], the following procedure
was introduced. New edges between pairs of vertices
of a regular d-dimensional lattice are added with probability
p(r), where r is the Euclidean distance between the
pair of vertices. If, e.g., $p(r) \propto \exp(-\mathrm{const}\, r)$, one gets
a disordered d-dimensional lattice. Much more slowly
decreasing functions produce the small-world effect and related
phenomena. In Refs. [136,137], one may find the study of
diffusion on a finite size network in the case of a power-law
dependence of this probability, $p(r) \propto r^{-\epsilon}$.



B. The smallest-world network

Let us demonstrate the phenomena, which we discuss
in the present section, using a trivial exactly solvable
example, "the smallest-world network" (see Fig. 14) [141].
We start from L vertices connected in a ring by L links of
unit length, that is, the coordination number z equals 2
and the clustering coefficient is zero. This is not essential
for us since we have no intention to discuss its behavior
(in such a case, instead of the clustering coefficient, one
may consider the density of linkage or degree). Then,
we add a central vertex and make shortcuts between it
and each other vertex with probability p. One may assume
that lengths of these additional edges equal 1/2. In
fact, with probability p, we select random vertices and
afterwards connect all of them together by edges of unit
length. For the initial lattice, $\bar{\ell}(p = 0) = L/4$, and, for
the completely connected one, $\bar{\ell}(p = 1) = 1$. One should
note that such networks may be rather reasonable in our
world, where a substantial number of connections occurs
through common centers (see Fig. 15).




FIG. 14. The "smallest-world" network [141]. L vertices
on the circle are connected by unit length edges. Each of
these vertices is connected to the central one by a half-length
edge with probability p.




FIG. 15. The real "smallest-world" network. Unsociable 
inhabitants live in this village. Usually, they contact only 
with their neighbors but some of them attend the church... 

One may calculate the distribution $P(\ell)$ of the
shortest-path lengths $\ell$ of the network exactly [141]. In
the scaling limit, $L \to \infty$ and $p \to 0$, while the quantities
$\rho \equiv pL$ (average number of added edges) and $z \equiv \ell/L$
are fixed, the distribution takes the form,

$$L P(\ell, p) \equiv Q(z, \rho) = 2\left[1 + 2\rho z + 2\rho^2 z(1 - 2z)\right] e^{-2\rho z}. \qquad (7)$$

This distribution is shown in Fig. 16. The corresponding
average shortest-path length between pairs of vertices
equals

$$\frac{\bar{\ell}}{L} \equiv \bar{z} = \frac{1}{2\rho^2}\left[2\rho - 3 + (\rho + 3)e^{-\rho}\right], \qquad (8)$$

that is just the scaling function f(x) discussed in Sec.
VII A (see Fig. 17). Hence, $\bar{z}(\rho = 0) = 1/4$ and
$\bar{z}(\rho \gg 1) \to 1/\rho$, i.e., $\bar{\ell} \to 1/p$. One may also obtain the
average shortest-path length $\langle\ell\rangle(k)$ between two vertices
of the network separated by the "Euclidean" distance k,
$k/L \equiv x$. In the scaling limit, we have



$$\frac{\langle\ell\rangle(k, \rho)}{L} \equiv \langle z\rangle(x) = \frac{1}{\rho}\left[1 - (1 + \rho x)e^{-2\rho x}\right] \qquad (9)$$
FIG. 17. The normalized average shortest-path length
$\bar{\ell}/L$ of the "smallest-world" network vs. the number $\rho \equiv pL$
of added edges.



(see Fig. 18). Obviously, $\langle\ell\rangle(k, \rho \to 0) \to k$, but
saturation is quickly achieved at large $pk$.




FIG. 16. The distribution $Q(z, \rho) \equiv L P(\ell, p)$ of the
normalized shortest-path lengths $z \equiv \ell/L$ of the
"smallest-world" network. Here, L is the size of the network,
$\rho \equiv pL$. Curves labeled by numbers from 1 to 6 correspond
to $\rho = 0, 2, 5, 8, 11, 14$.
FIG. 18. The normalized average shortest-path length
$p\langle\ell\rangle$ between two vertices of the "smallest-world" network
separated by the "Euclidean" distance k as a function of $pk$.



Eqs. (7)-(9) actually demonstrate the main features of
the crossover phenomenon in the models under discussion,
although our toy model does not approach the classical
random network at large p. $\bar{\ell}$ of the model already
diminishes sharply in the range of p where local properties of
the network are nearly the same as of the initial regular
structure. In Ref. [165], one can find the generalization
of this model: the probability that a vertex is connected
to the center is assumed to be dependent on the state of
its closest environment.
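
The scaling forms of the "smallest-world" network can be cross-checked numerically. The sketch below assumes the expressions $Q(z,\rho) = 2[1 + 2\rho z + 2\rho^2 z(1-2z)]e^{-2\rho z}$ on $0 \le z \le 1/2$, $\bar{z} = [2\rho - 3 + (\rho+3)e^{-\rho}]/(2\rho^2)$, and $\langle z\rangle(x) = [1 - (1+\rho x)e^{-2\rho x}]/\rho$ with x uniformly distributed on $[0, 1/2]$. It verifies by a midpoint rule that Q is normalized and that both the mean of Q and the average of $\langle z\rangle(x)$ over x reproduce $\bar{z}$. The helper names are hypothetical.

```python
import math

def Q(z, rho):
    """Scaled distribution of shortest-path lengths, defined for 0 <= z <= 1/2."""
    return 2.0 * (1.0 + 2.0 * rho * z + 2.0 * rho ** 2 * z * (1.0 - 2.0 * z)) * math.exp(-2.0 * rho * z)

def mean_z(rho):
    """Closed form of the normalized average shortest-path length."""
    if rho == 0.0:
        return 0.25
    return (2.0 * rho - 3.0 + (rho + 3.0) * math.exp(-rho)) / (2.0 * rho ** 2)

def z_at_distance(x, rho):
    """Normalized average shortest path between vertices at ring distance x = k/L."""
    if rho == 0.0:
        return x
    return (1.0 - (1.0 + rho * x) * math.exp(-2.0 * rho * x)) / rho

def midpoint(f, a, b, n=20000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

checks = []
for rho in (0.5, 2.0, 5.0, 14.0):
    norm = midpoint(lambda z: Q(z, rho), 0.0, 0.5)
    mean_from_Q = midpoint(lambda z: z * Q(z, rho), 0.0, 0.5)
    # ring distances are uniform on [0, 1/2], i.e. they have density 2
    mean_from_x = midpoint(lambda x: 2.0 * z_at_distance(x, rho), 0.0, 0.5)
    checks.append((rho, norm, mean_from_Q, mean_from_x, mean_z(rho)))
    print(f"rho = {rho:5.1f}: norm = {norm:.6f}, mean = {mean_from_Q:.6f}, "
          f"{mean_from_x:.6f}, closed form = {mean_z(rho):.6f}")
```

The three estimates of $\bar{z}$ agree for every $\rho$, and for large $\rho$ they approach $1/\rho$, in line with the limits quoted above.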
C. Other possibilities to obtain large clustering
coefficient



The first aim of Watts and Strogatz [11] was to construct
networks with small average shortest paths and
relatively large clustering coefficients which can mimic
the corresponding behavior of real networks. In their
network, the number of vertices is fixed, and only edges
are updated (or are added). Most of the known networks,
at least, do not grow like this. Let us demonstrate a simple
network with a similar combination of these parameters
($\bar{\ell}$ and C) but evolving in a different way: the growth of
the network is due to both addition of new vertices and
addition of new edges.

In this model, initially, there are three vertices connected
by three undirected edges (see Fig. 19). At each time
step, let a new vertex be added. It connects to
a randomly chosen triple of nearest neighbor vertices of
the network (that is, to three mutually connected vertices).
This procedure provides a network displaying
the small-world effect. We will show below that this
is a network with preferential linking. Its power-law de- 
gree distribution can be calculated exactly [166] (see Sec. 

Excl). 




FIG. 19. A simple growing network with a large cluster- 
ing coefficient. In the initial configuration, three vertices are 
present. At each time step, a new vertex with three edges is 
added. These edges are attached to randomly chosen triples 
of nearest neighbor vertices. 



At the moment, we are interested only in the clustering
coefficient. Initially, C = 1 (see Fig. 19(a)). Let us
estimate its value for the large network. One can see that
the number of triangles of edges in the network increases
by three each time a vertex is added. Simultaneously, the
number of triples of connected vertices increases by the
sum of degrees of all three vertices to which the new vertex
is connected. This sum may be estimated as $3\bar{k}$. Here,
$\bar{k} = 2(3t)/t = 6$. Hence, using the definition of the
clustering coefficient, we get $C \approx 3(3t)/(3\bar{k}t) = 3/\bar{k} = 1/2$.
Therefore, C is much larger than the characteristic value
$\bar{k}/t$ for classical random graphs, and this simple network,
constructed in a quite different way than the WS model,
shows both discussed features of many real networks (see
also the model with very similar properties in Sec. IX C,
Fig. ^l|). The reason for such a large value of the clustering
coefficient is the simultaneous connection of a new
vertex to nearest neighboring old vertices. This can partially
explain the abundance of networks with large clustering
coefficient in Nature. Indeed, the growth process,
in which some old nearest neighbors connect together to
a new vertex, that is, together "give birth" to it, seems
quite natural (see Ref. [103]).
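
The growth rule of this model is simple enough to simulate directly. The following sketch is an illustration under stated assumptions, not the calculation of the text: it keeps a list of all triangles, attaches each new vertex to a uniformly chosen triangle, and tracks degrees and per-vertex triangle counts incrementally. It reports two common measures of clustering, the ratio 3 x (number of triangles) / (number of connected triples) and the average of the local clustering over vertices; which of them corresponds to C in the estimate above is left open here. The function names and the number of steps are hypothetical.

```python
import random

def grow(steps, seed=0):
    """Grow the network: each new vertex links to the three vertices of a random triangle."""
    rng = random.Random(seed)
    degree = [2, 2, 2]        # the initial triangle
    tri_count = [1, 1, 1]     # number of triangles passing through each vertex
    triangles = [(0, 1, 2)]
    for _ in range(steps):
        a, b, c = rng.choice(triangles)
        new = len(degree)
        degree.append(3)
        tri_count.append(3)   # (new,a,b), (new,b,c), (new,a,c)
        for v in (a, b, c):
            degree[v] += 1
            tri_count[v] += 2  # each old vertex gains two of the new triangles
        triangles.extend([(new, a, b), (new, b, c), (new, a, c)])
    return degree, tri_count, triangles

def clustering(degree, tri_count, triangles):
    """Return (3 * triangles / connected triples, average local clustering)."""
    triples = sum(k * (k - 1) // 2 for k in degree)
    global_c = 3 * len(triangles) / triples
    local_c = sum(t / (k * (k - 1) / 2) for k, t in zip(degree, tri_count)) / len(degree)
    return global_c, local_c

steps = 5000
degree, tri_count, triangles = grow(steps)
n = len(degree)
mean_degree = sum(degree) / n
global_c, local_c = clustering(degree, tri_count, triangles)
print(f"N = {n}, average degree = {mean_degree:.3f}")
print(f"3*triangles/triples = {global_c:.3f}, average local clustering = {local_c:.3f}")
print(f"classical random graph estimate k/N = {mean_degree / n:.5f}")
```

Both measures come out far above the random-graph value $\bar{k}/N$, while the average degree stays close to 6, as in the estimate above.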

Another possibility to obtain a large clustering coefficient
in a growing network is connecting a new vertex to
several of its immediate predecessors with high probability
(see also models proposed in Refs.

We should add that the one-mode projections of bipartite
random graphs also have large clustering coefficients
(see Secs. V B and XI A).



VIII. GROWING EXPONENTIAL NETWORKS 



The classical random network considered in Sec. VI
has a fixed number of vertices. Let us discuss the simplest
random network in which the number of vertices grows
|pq , p6| . At each increment of time, let a new vertex be
added to the network. It connects to a randomly chosen
(i.e., without any preference) old vertex (see Fig. ||).
Let connections be undirected, although it is inessential
here. The growth begins from the configuration consisting
of two connected vertices at time t = 1, so, at time
t, the network consists of t + 1 vertices and t edges. The
total degree equals 2t. One can check that the average
shortest-path length in this network is $\bar{\ell} \sim \ln t$, like in
classical random graphs.

It is easy to obtain the degree distribution for such
a net. We may label vertices by their birth times,
s = 0, 1, 2, ..., t. Let us introduce the probability, p(k, s, t),
that a vertex s has degree k at time t. The master equation
describing the evolution of the degree distribution of
individual vertices is

$$p(k, s, t+1) = \frac{1}{t+1}\, p(k-1, s, t) + \left(1 - \frac{1}{t+1}\right) p(k, s, t), \qquad (10)$$

with $p(k, s = 0,1, t = 1) = \delta_{k,1}$ and $p(k, s = t, t > 1) = \delta_{k,1}$. This
accounts for two possibilities for a vertex s. (i) With
probability 1/(t + 1), it may get an extra edge from the
new vertex and increase its own degree by 1. (ii) With
the complementary probability 1 - 1/(t + 1), the vertex s
may remain in the former state with the former degree.
Notice that the second condition above makes Eq. (10)
non-trivial.
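
Eq. (10) can be iterated numerically as it stands. The sketch below is a hedged illustration: it stores p(k, s, t) for every vertex s, applies the update with attachment probability 1/(t + 1), and then appends a newborn vertex of degree 1. Averaging over s gives the degree distribution of the whole network, which can be compared with $2^{-k}$; the mean degree of an individual vertex can be compared with $1 + \sum_{j=s+1}^{t} 1/j$, which follows from summing the increments 1/(t + 1). The cutoff `kmax`, the final time T = 200, and the function names are assumptions made for the example.

```python
def evolve(T, kmax=30):
    """Iterate the master equation for p(k, s, t) from t = 1 up to t = T.

    Returns a list p with p[s][k] = probability that vertex s has degree k at time T.
    Degrees above kmax are dropped (their total weight is negligible here).
    """
    # at t = 1 there are two vertices, s = 0 and s = 1, both of degree 1
    p = [[0.0] * (kmax + 1) for _ in range(2)]
    p[0][1] = p[1][1] = 1.0
    for t in range(1, T):
        attach = 1.0 / (t + 1)            # t + 1 vertices exist at time t
        for row in p:
            # descending k so that row[k - 1] is still the value at time t
            for k in range(kmax, 0, -1):
                row[k] = attach * row[k - 1] + (1.0 - attach) * row[k]
        newborn = [0.0] * (kmax + 1)
        newborn[1] = 1.0                  # the new vertex s = t + 1 has degree 1
        p.append(newborn)
    return p

T = 200
p = evolve(T)

# degree distribution of the whole network: average over all T + 1 vertices
P = [sum(row[k] for row in p) / (T + 1) for k in range(len(p[0]))]
for k in range(1, 6):
    print(f"P({k}) = {P[k]:.5f}   2^-{k} = {2.0 ** -k:.5f}")

def mean_degree(s):
    """Average degree of vertex s at time T, computed from the distribution."""
    return sum(k * pk for k, pk in enumerate(p[s]))

s = 10
expected = 1.0 + sum(1.0 / j for j in range(s + 1, T + 1))
print(f"mean degree of vertex {s}: {mean_degree(s):.6f} (sum formula: {expected:.6f})")
```

Already at T = 200 the first few values of P(k) are close to $2^{-k}$, and the oldest vertices have the largest average degree.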

The total degree distribution of the entire network is 



1 * 

^(M) = — - 5>(fc, M) 

s=0 



(11) 



Using this definition and applying 53 s =o ^° sides of 



Eq. (10), we get the following master equation for the 



total degree distribution, 

(f + l)P(k, t + 1)- tP{k, t) = P{k - 1, t) - P(k, t) + 4,i 

(12) 

The corresponding stationary equation, i.e., at t — » oo, 
takes the form 
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2P(k) - P(k - 1) = 4,1 



(13) 



IX. SCALE-FREE NETWORKS 



(note that the stationary degree distribution P(k) = 
P(fc,i — > oo) exists). It has the solution of an expo- 
nential form, 



P(k) = 2~ k . 



(14) 



Therefore, networks of such a type often are called "expo- 
nential" . This form differs from the Poisson degree distri- 
bution of classical random graphs, see Sec. VI. Neverthe- 
less, both distributions are rapidly decreasing functions, 
unlike degree distributions of numerous large networks in 
Nature. 

The average degree of vertex s at time t is 



As we saw in Sec. V, at least several important large
growing networks in Nature are scale-free, i.e., their degree
distributions are of a power-law form (nevertheless,
look at the remark in Sec. V C 2 concerning the quality
of the experimental material). The natural question
is how they self-organize into scale-free structures while
growing. What is the mechanism responsible for such
self-organization? For explanation of these phenomena,
the idea of preferential linking (preferential attachment
of edges to vertices) has been proposed [55,56].



A. Barabási-Albert model and the idea of
preferential linking



$$\bar{k}(s,t) = \sum_{k=1}^{\infty} k\,p(k,s,t). \qquad (15)$$



We have demonstrated in Sec. VIII that if new connec-



Applying $\sum_{k=1}^{\infty} k$ to both sides of Eq. (10), we get the
equation for this quantity,

$$\bar{k}(s,t+1) = \bar{k}(s,t) + \frac{1}{t+1}. \qquad (16)$$



The resulting average degree of individual vertices equals

$$\bar{k}(s,t) = 1 + \sum_{j=1}^{t-s} \frac{1}{s+j} = 1 + \psi(t+1) - \psi(s+1) \qquad (17)$$

($\bar{k}(0,t) = \bar{k}(1,t)$). Here, $\psi(\ )$ is the $\psi$-function, i.e.
the logarithmic derivative of the gamma-function. For
$s, t \gg 1$, we obtain the asymptotic form,

$$\bar{k}(s,t) = 1 - \ln(s/t), \qquad (18)$$



i.e., the average degree of individual vertices of this net- 
work weakly diverges in the region of the oldest vertex. 
Hence, the oldest vertex is the "richest" (of course, in the 
statistical sense, i.e., with high probability). 
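As a quick numerical illustration of Eqs. (16)-(18), the sketch below iterates the recursion (16) from the birth of a vertex and compares the result with the logarithmic asymptotics. The function name is an illustrative assumption; vertices s = 0 and s = 1 are both taken to start with degree 1 at t = 1, as in the initial condition of Eq. (10).

```python
import math

def mean_degree(s, t):
    """Average degree of vertex s at time t, obtained by iterating Eq. (16)."""
    k = 1.0                                  # every vertex is born with degree 1
    for tau in range(max(s, 1), t):          # vertices 0 and 1 both exist from t = 1
        k += 1.0 / (tau + 1)
    return k

def mean_degree_asymptotic(s, t):
    """Eq. (18), valid for s, t >> 1."""
    return 1.0 - math.log(s / t)
```

For s = 100 and t = 10000 the two functions agree to within about 0.005, while for the oldest vertices the exact sum grows only logarithmically with t.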

From Eq. (10), one can also find the degree distribution
of individual vertices, p(k, s, t), for large s and t and
fixed s/t,

$$p(k,s,t) = \frac{s}{t}\,\frac{1}{(k-1)!}\,\ln^{k-1}\!\left(\frac{t}{s}\right). \qquad (19)$$



One sees that this function decreases rapidly at large val- 
ues of degree k. 

Similar results may be easily obtained for a network 
in which each new vertex has not one, as previously, but 
any fixed number of connections with randomly chosen 
old vertices. In fact, all the results of the present section 
are typical for growing exponential networks. 



tions in a growing network appear between vertices cho- 
sen without any preference, e.g., between new vertices 
and randomly chosen old ones, the degree distribution 
is exponential. Nevertheless, in real networks, linking is 
very often preferential. 

For example, when you make a new reference in your
own page, the probability that you refer to a popular
Web document is certainly higher than the probability
that this reference is to some poorly known document
to which nobody referred before you. Therefore, popular
vertices with a high number of links are more attractive for
new connections than vertices with few links - popularity
is attractive.

Let us demonstrate the growth of a network with 
preferential linking using, as the simplest example, the 
Barabási-Albert model (the BA model) [55]. We return
to the model described in Sec. VIII (see Fig. ||) and 
change in it only one aspect. Now a new vertex connects 
not to a randomly chosen old vertex but to a vertex cho- 
sen preferentially. 

We describe here the simplest situation: The probability
that the edge is attached to an old vertex is proportional
to the degree of this old vertex, i.e., to the total
number of its connections. At time t, the total number
of edges is t, and the total degree equals 2t. Hence, this
probability equals k/(2t). One should emphasize that
this is only a particular form of a preference function.
However, just the linear type of the preference was indicated
in several real networks [15,101] (see discussion
in Secs. V A, V B and V C 1). To account for the preferential
linking, we must make obvious modifications to
the master equation, Eq. (10). For the BA model, the
master equation takes the following form,



$$p(k,s,t+1) = \frac{k-1}{2t}\,p(k-1,s,t) + \left(1 - \frac{k}{2t}\right) p(k,s,t), \qquad (20)$$

with the initial condition $p(k, s=0,1, t=1) = \delta_{k,1}$ and
the boundary one $p(k,t,t) = \delta_{k,1}$. From Eqs. (11) and
(20), we get the master equation for the total degree distribution,



$$(t+1)P(k,t+1) - tP(k,t) = \frac{1}{2}\left[(k-1)P(k-1,t) - kP(k,t)\right] + \delta_{k,1}, \qquad (21)$$

and, in the limit $t \to \infty$, the equation for the stationary
distribution,

$$P(k) + \frac{1}{2}\left[kP(k) - (k-1)P(k-1)\right] = \delta_{k,1}. \qquad (22)$$



In the continuum k limit, this equation is of the form
$P(k) + (1/2)\,d[kP(k)]/dk = 0$. The solution of the last
equation is $P(k) \propto k^{-3}$. Thus, the preferential linking of
the form that we consider provides a scale-free network,
and the $\gamma$ exponent of its distribution equals 3 [55,56].
This value is exact, see Refs. [94,169] and the discussion
below.
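The stationary equation (22) is a first-order recursion in k, so it can be solved term by term without passing to the continuum limit. The sketch below does this and estimates the local exponent from the ratio P(2k)/P(k). The function names are illustrative assumptions; the closed form used for comparison, 4/[k(k+1)(k+2)], is Eq. (27) with m = 1 and k = q + 1.

```python
import math

def ba_stationary(k_max):
    """Solve Eq. (22): (1 + k/2) P(k) = ((k - 1)/2) P(k - 1) + delta_{k,1}."""
    P = [0.0] * (k_max + 1)
    for k in range(1, k_max + 1):
        source = 1.0 if k == 1 else 0.0      # the Kronecker delta term
        P[k] = (0.5 * (k - 1) * P[k - 1] + source) / (1.0 + 0.5 * k)
    return P

def local_slope(P, k):
    """Estimate gamma from P(2k)/P(k) on a log-log scale."""
    return -math.log(P[2 * k] / P[k]) / math.log(2.0)
```

With `P = ba_stationary(2000)`, the value `local_slope(P, 1000)` is close to 3, in line with the continuum estimate above.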

We emphasize that the preferential linking mechanism
[55,56] (1999) is the basic idea of the modern theory of
evolving networks. Notice that preferential attachment
may also arise effectively, in an indirect way (e.g., see
Sec. IX C and models from Refs. p7| , pl8| , p0|[ ).
The recent empirical data [15,101-103] (see Secs. V A, V B
and V C) on the dynamics of the attachment of new edges in
various growing networks provide support for this mechanism.



B. Master equation approach 



The master equation approach [169] is very efficient
for problems of the network evolution. Indeed, the linear
discrete difference equations that arise (usually of first
order) can be easily solved, e.g., using the Z-transform. Let
us describe the degree distributions for networks with
preferential linking of a more general type than in Sec.
IX A.







FIG. 20. Scheme of the growth of the basic directed net- 
work under preferential linking mechanism. At each time step 
a new vertex and m directed edges are added. Their source 
ends may be anywhere. The target ends of these edges are 
attached to vertices of the network according to the rule of 
preferential linking. 

Let us consider the following network with directed
edges (see Fig. 20). We will discuss here the in-degree
distribution, so that we use, for brevity, the notations
$q(s,t) \equiv k_i(s,t)$ and $\gamma$ instead of $\gamma_i$.

(i) At each time step, a new vertex is added to the 
network. 

(ii) Simultaneously, m new directed edges going out
of non-specified vertices or even from the outside of the
network appear.

(iii) Target ends of the new edges are distributed 
among vertices according to the following rule. The prob- 
ability that a new edge points to some vertex s is pro- 
portional to q(s) + A. 

The parameter $A \equiv ma$ plays the role of additional
attractiveness of vertices. The resulting in-degree distribution
does not depend on the place from which new
edges go out. If, in particular, each new vertex is the
source of all the m new edges (see a citation graph in
Fig. |^), then k(s,t) = q(s,t) + m, and the degree of each
vertex is fixed by its in-degree. If, in addition, we set
A = m, i.e., a = 1, then new edges are distributed with
probability proportional to k(s,t), and we come to the
BA model.

Let us discuss the general case. The structure of the
master equation for the in-degree distribution of individual
vertices, p(q, s, t), may be understood from the following.
The probability that a new edge comes to a vertex
s equals [q(s, t) + am]/[(1 + a)mt]. Here, a = A/m. The
probability that a vertex s receives exactly l new edges
of the m injected is

$$\mathcal{P}_s^{(ml)} = \binom{m}{l} \left[\frac{q(s,t)+am}{(1+a)mt}\right]^{l} \left[1 - \frac{q(s,t)+am}{(1+a)mt}\right]^{m-l}. \qquad (23)$$



Hence, the in-degree distribution of an individual vertex
of the large network under consideration obeys the
following master equation,

$$p(q,s,t+1) = \sum_{l=0}^{m} \mathcal{P}_s^{(ml)}\, p(q-l,s,t) = \sum_{l=0}^{m} \binom{m}{l} \left[\frac{q-l+am}{(1+a)mt}\right]^{l} \left[1 - \frac{q-l+am}{(1+a)mt}\right]^{m-l} p(q-l,s,t). \qquad (24)$$

Vertices of this simple network are born without incoming edges, so the boundary condition for this equation is
$p(q,t,t) = \delta_{q,0}$, where $\delta_{i,j}$ is the Kronecker symbol. The initial condition is fixed by the initial configuration of the
network. Summing up Eq. (24) over s, at long times, one gets the difference-differential equation

$$(1+a)\,t\,\frac{\partial P}{\partial t}(q,t) + (1+a)P(q,t) + (q+am)P(q,t) - (q-1+am)P(q-1,t) = (1+a)\,\delta_{q,0}. \qquad (25)$$

Excluding from it the term with the derivative, we obtain the equation for the stationary in-degree distribution
$P(q) = P(q, t \to \infty)$, that is, for the in-degree distribution of the infinitely large network. (In fact, we have assumed
that this stationary distribution exists. In the situation that we consider, this assumption is quite reasonable.)



One may check by direct substitution that the exact
solution of the stationary equation is of the form [169]

$$P(q) = (1+a)\,\frac{\Gamma[1+(m+1)a]}{\Gamma(ma)}\;\frac{\Gamma(q+ma)}{\Gamma[q+2+(m+1)a]}. \qquad (26)$$



Here, $\Gamma(\ )$ is the gamma-function. In particular, when
a = 1, that corresponds to the BA model [55], we get the
expression

$$P(q) = \frac{2m(m+1)}{(q+m)(q+m+1)(q+m+2)}. \qquad (27)$$

To get the degree distribution of the BA model, one has
only to substitute the degree k instead of q + m into Eq.
(27). Hence the continuum approximation introduced in
Sec. IX A indeed produced the proper value 3 of the
exponent of this distribution.



For $q + ma \gg 1$, the stationary distribution (26) takes
the asymptotic form:

$$P(q) \propto (q+ma)^{-(2+a)}. \qquad (28)$$

Therefore, the scaling exponent $\gamma$ of the distribution depends
on the additional attractiveness in the following
way:

$$\gamma = 2 + a = 2 + A/m. \qquad (29)$$

Since A > 0, $\gamma$ varies between 2 and $\infty$. This range of
the $\gamma$ exponent values is natural for networks with constant
average degree. In such a case, the first moment of
the degree distribution must be finite, so that $\gamma > 2$ (see
discussion in Sec. IX J).
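Equations (26)-(29) are easy to probe numerically. The following sketch evaluates the stationary distribution (26) through the log-gamma function and estimates the exponent from the ratio P(2q)/P(q). The function names are illustrative assumptions, and the log-gamma route is merely a convenient way to avoid overflow at large q.

```python
import math

def p_stationary(q, m, a):
    """Stationary in-degree distribution, Eq. (26), for A = m * a > 0."""
    log_p = (math.log(1.0 + a)
             + math.lgamma(1.0 + (m + 1) * a) - math.lgamma(m * a)
             + math.lgamma(q + m * a) - math.lgamma(q + 2.0 + (m + 1) * a))
    return math.exp(log_p)

def local_exponent(q, m, a):
    """Estimate gamma from the slope between q and 2q on a log-log scale."""
    ratio = p_stationary(2 * q, m, a) / p_stationary(q, m, a)
    return -math.log(ratio) / math.log(2.0)
```

For example, `local_exponent(100000, 2, 0.5)` is close to 2.5 = 2 + a, and for a = 1 the values of `p_stationary` coincide with Eq. (27).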

For this network, one may also find the in-degree distribution
of individual vertices. At long times, the equation
for it follows from Eq. (24),

$$p(q,s,t+1) = \left[1 - \frac{q+am}{(1+a)t}\right] p(q,s,t) + \frac{q-1+am}{(1+a)t}\,p(q-1,s,t) + O\!\left(\frac{p}{t^2}\right). \qquad (30)$$

Assuming that the scale of time variation is much larger than 1, at long times (large sizes of the network) we can
replace the finite t-difference with a derivative:

$$(1+a)\,t\,\frac{\partial p}{\partial t}(q,s,t) = (q-1+am)\,p(q-1,s,t) - (q+am)\,p(q,s,t). \qquad (31)$$



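The solution of Eq. (31), i.e., the in-degree distribution
of individual vertices, is

$$p(q,s,t) = \frac{\Gamma(am+q)}{\Gamma(am)\,q!} \left(\frac{s}{t}\right)^{am/(1+a)} \left[1 - \left(\frac{s}{t}\right)^{1/(1+a)}\right]^{q}. \qquad (32)$$

Hence, this distribution has an exponential tail. One
may also get the expression for the average in-degree of
a given vertex:

$$\bar{q}(s,t) = \sum_{q=0}^{\infty} q\,p(q,s,t) = am\left[\left(\frac{s}{t}\right)^{-1/(1+a)} - 1\right]. \qquad (33)$$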



Unlike a weak logarithmic divergence of the average degree
for the oldest vertices of the exponential network (see Eq.
(18)), here, at fixed time t, the average in-degree of an
old vertex $s \ll t$ diverges as $s^{-\beta}$, where the exponent
$\beta = 1/(1+a)$. One sees that for the BA model, $\beta = 1/2$.
The average degree of the oldest vertices is the highest,
so the rule "the oldest is the richest" is certainly fulfilled
here. The singularity is strong, so the effect is
pronounced. From Eqs. (29) and (33), we obtain the
following relation between the exponents of the network:

$$\beta(\gamma - 1) = 1. \qquad (34)$$

We will show in Sec. IX D that the relation, Eq. (34),
is universal for scale-free networks and can be obtained
from general considerations (nevertheless, see the discussion
of a particular case of violation of this relation in
Sec. pCKp .
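The power-law growth of the average in-degree of old vertices can be illustrated with a few lines of code. Multiplying Eq. (30) by q and summing over q gives the recursion used below for the mean in-degree (this intermediate step is our own, spelled out in the comment). The sketch compares its solution with Eq. (33); the function names are illustrative assumptions.

```python
def mean_in_degree(s, t, m, a):
    """Iterate the first moment of Eq. (30):
    q_mean(s, tau + 1) = q_mean(s, tau) + (q_mean(s, tau) + a*m) / ((1 + a) * tau),
    starting from q_mean(s, s) = 0 (vertices are born without incoming edges)."""
    q = 0.0
    for tau in range(s, t):
        q += (q + a * m) / ((1.0 + a) * tau)
    return q

def mean_in_degree_continuum(s, t, m, a):
    """Eq. (33), with beta = 1 / (1 + a)."""
    beta = 1.0 / (1.0 + a)
    return a * m * ((s / t) ** (-beta) - 1.0)
```

For the BA case (a = 1) with s = 100 and t = 10000, both functions give a value close to 9m, i.e. the growth $(t/s)^{1/2}$ expected from $\beta = 1/2$.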

In the scaling limit, when $q, s, t \to \infty$, $s \ll t$, and
the scaling variable $\xi \equiv q(s/t)^{\beta}$ is fixed, the in-degree
distribution, Eq. (32), takes the form

$$p(q,s,t) = \left(\frac{s}{t}\right)^{\beta} f\!\left[q\left(\frac{s}{t}\right)^{\beta}\right], \qquad (35)$$

where the scaling function is

$$f(\xi) = \frac{1}{\Gamma(am)}\,\xi^{am-1} \exp(-\xi). \qquad (36)$$

Note that a particular form of the scaling function is
model-dependent.



We introduce the growing network with undirected
edges (see Fig. 21). Initially (t = 2), three vertices are
present, s = 0, 1, 2, each with degree 2.

(i) At each increment of time, a new vertex is added. 

(ii) It is connected to both ends of a randomly chosen 
edge by two undirected edges. 

The preferential linking arises in this simple model not
because of some special rule including a function of vertex
degree as in Refs. [55,169] but quite naturally. Indeed, in
the model that we consider here, the probability that a
vertex has a randomly chosen edge attached to it is equal
to the ratio of the degree k of the vertex and the total
number of edges, 2t - 1. Therefore, the evolution of the
network is described by the following master equation for
the degree distribution of individual vertices,



C. A simple model of scale- free networks 



The results of Secs. IX A and IX B were obtained for
large networks. Let us discuss a simple scale-free growing
net for which exact answers may be obtained for an arbitrary
size, without passing to the limit of large networks
[166].




FIG. 21. Illustration of a simple model of a scale-free
growing network [166]. In the initial configuration, t = 2,
three vertices are present, s = 0, 1, 2 (a). At each increment
of time, a new vertex with two edges is added. These edges
are attached to the ends of a randomly chosen edge of the
network.



$$p(k,s,t+1) = \frac{k-1}{2t-1}\,p(k-1,s,t) + \frac{2t-1-k}{2t-1}\,p(k,s,t), \qquad (37)$$

with the initial condition, $p(k, s=\{0,1,2\}, t=2) = \delta_{k,2}$.
Also, $p(k,t,t) = \delta_{k,2}$. This master equation and all the
following ones in this subsection are exact for all $t \geq 2$.
Eq. (37) has a form similar to that of the BA model, Eq.
(20). Therefore, the scaling exponents of these models
have to coincide.

From Eq. (37), there follows a number of exact relations
for this model. In particular, from Eq. (37), one
may find the equation for the average degree of an individual
vertex, $\bar{k}(s,t) = \sum_{k=2}^{t-s+2} k\,p(k,s,t)$:

$$\bar{k}(s,t+1) = \frac{2t}{2t-1}\,\bar{k}(s,t), \qquad \bar{k}(t,t) = 2, \qquad (38)$$

with the following solution:

$$\bar{k}(s,t) = 2^{\,t-s+1}\,\frac{(t-1)!}{(s-1)!}\,\frac{(2s-3)!!}{(2t-3)!!} \;\cong\; 2\sqrt{\frac{t}{s}} \quad (s,t \gg 1). \qquad (39)$$



Here, $s \geq 2$ and $\bar{k}(0,t) = \bar{k}(1,t) = \bar{k}(2,t)$. Hence,
the scaling exponent $\beta$, defined through the relation
$\bar{k}(s,t) \propto (s/t)^{-\beta}$, equals 1/2 as for the BA model.

The scaling form of p(k, s, t) for $k, s, t \gg 1$ and $k\sqrt{s/t}$
fixed is

$$p(k,s,t) = \sqrt{\frac{s}{t}}\left(k\sqrt{\frac{s}{t}}\right) \exp\!\left(-k\sqrt{\frac{s}{t}}\right) \qquad (40)$$

(compare with Eqs. (35) and (36)).



The matter of interest is the total degree distribution, $P(k,t) = \sum_{s=0}^{t} p(k,s,t)/(t+1)$. The equation for it follows
from Eq. (37),

$$P(k,t) = \frac{t}{t+1}\left[\frac{k-1}{2t-3}\,P(k-1,t-1) + \left(1 - \frac{k}{2t-3}\right) P(k,t-1)\right] + \frac{1}{t+1}\,\delta_{k,2}, \qquad (41)$$

with the initial condition $P(k,2) = \delta_{k,2}$.



In the limit of the large network size, $t \to \infty$, P(k,t)
approaches a stationary degree distribution P(k) which
is very similar to the degree distribution of the BA model,

$$P(k) = \frac{12}{k(k+1)(k+2)}, \qquad (42)$$

that is, $\gamma = 3$.
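The model is simple enough to simulate directly. The sketch below grows the network by attaching each new vertex to both ends of a randomly chosen edge and returns the empirical degree fractions, which can be compared with Eq. (42). The function names, the seed handling, and the chosen network size are illustrative assumptions.

```python
import random
from collections import Counter

def grow_simple_model(t_max, seed=0):
    """Grow the network up to time t_max and return the list of vertex degrees."""
    rng = random.Random(seed)
    edges = [(0, 1), (1, 2), (0, 2)]         # t = 2: three vertices, each of degree 2
    degree = [2, 2, 2]
    for new in range(3, t_max + 1):          # the vertex born at time t = new
        u, v = rng.choice(edges)             # a randomly chosen existing edge
        edges.append((u, new))
        edges.append((v, new))
        degree[u] += 1
        degree[v] += 1
        degree.append(2)
    return degree

def degree_fractions(degree):
    """Empirical P(k): fraction of vertices with each degree k."""
    counts = Counter(degree)
    n = len(degree)
    return {k: c / n for k, c in counts.items()}
```

For a network of, say, 20000 vertices, the fractions of vertices with degree 2, 3 and 4 come out near 1/2, 1/5 and 1/10, as Eq. (42) predicts; the large-k tail is noisier and is affected by the cut-off at $k \sim \sqrt{t}$.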

How does the degree distribution approach this stationary
limit? We do not write down the cumbersome
exact solution of Eq. (41) [166] but only write its scaling
form for large k and long time t with $k/\sqrt{t}$ fixed:

$$P(k,t) = P(k)\left[1 + \frac{1}{4}\frac{k^2}{t} + \frac{1}{8}\left(\frac{k^2}{t}\right)^{2}\right] \exp\!\left(-\frac{1}{4}\frac{k^2}{t}\right). \qquad (43)$$

The factor $P(k,t)/P(k) \equiv g(k/\sqrt{t})$ depends only on the
combination $k/\sqrt{t}$. Therefore, the peculiarities of the
distribution induced by the size effects never disappear
but only move with increasing time in the direction of
large degree. The function $g(k/\sqrt{t})$ is shown in Fig.
22. Thus, the power-law dependence of the degree distribution
of the finite size network is observable only in
a rather narrow region, $1 \ll k \ll \sqrt{t}$. The cut-off at
$k_{cut} \sim \sqrt{t} = t^{1/(\gamma-1)}$ and the hump impede observation
of scale-free behavior.




FIG. 22. Deviation of the degree distribution
of the finite-size network from the stationary one,
$P(k,t)/P(k, t \to \infty)$, vs. $k/\sqrt{t}$. The form of the hump depends
on the initial configuration of the network.

One can check that the form of the hump in Fig. 22 depends
on the initial conditions. In our case, the evolution
starts from the configuration shown in Fig. 21(a). If the
growth starts from another configuration, the form of the
hump will be different. Note that this trace of the initial
conditions is visible at any size of the network. Similar
humps (or peaks) at the cut-off position were also observed
recently [172] in the non-stationary distributions
of the Simon model |6JJ6J (see Sec.

In Sec. V C 2 we have mentioned a kind of bipartite
sub-graphs (bipartite cliques) which are used for indexing
of cyber-communities in a large directed graph, the
WWW. In such a bipartite sub-graph, all ha directed
edges connecting h hubs to a authorities are present (see
Fig. 0). One may easily check that in a large equilibrium
random graph the total number of these bipartite
sub-graphs is negligible [173]. This is not the case for
the growing networks. In the model under discussion,
the statistics of the bipartite sub-graphs is very simple,
so that we use this model as an illustrating example.

Let us slightly modify the model to get a directed
network. For this, let new edges be directed from new
vertices to old ones. The possible number of authorities
in the bipartite subgraphs of our graph is fixed:
a = 2. Each pair of nearest neighbor vertices plays
the role of authorities of a bipartite sub-graph based
on them. At each time step, a new hub (a new vertex)
is added to a randomly chosen bipartite clique and two
new cliques (two new edges) emerge. The total number
$N_b(h, a=2, t) \equiv N_b(h,t)$ of bipartite sub-graphs with h
hubs in the network at time t satisfies the following simple
equation

$$N_b(h,t+1) = 2\delta_{h,0} + N_b(h,t) + \frac{1}{t}\,N_b(h-1,t) - \frac{1}{t}\,N_b(h,t). \qquad (44)$$

The first term on the right-hand side of Eq. (44) is a
contribution from two new edges; the third and fourth
terms are due to the addition of a new hub, that is, a new
vertex, to the network.

The probability that a randomly chosen vertex belongs
to the bipartite sub-graphs with h hubs is $G(h,t) =
N_b(h,t)/t$. From Eq. (44) we have

$$(t+1)\,G(h,t+1) - t\,G(h,t) = 2\delta_{h,0} - G(h,t) + G(h-1,t). \qquad (45)$$

Its stationary solution $G(h) = G(h, t \to \infty)$ is $G(h) =
2^{-h}$. Hence, the total number of the bipartite sub-graphs
with h hubs in the large network is large (proportional
to t) and decreases exponentially as h grows:

$$N_b(h,t) \cong t\,2^{-h}. \qquad (46)$$



This result agrees with an estimate made in Ref. [173] for
citation graphs growing under the mechanism of preferential
linking and correlates with the measurements of
the distribution of the bipartite subgraphs in the WWW
(see Sec. V C 2).

Note that the above model is very close to the network
growing under the mechanism of linking to triples of
the nearest neighbor nodes (see Sec. VII C). The degree
distributions of these networks are very similar ($\gamma = 3$).
Both these scale-free networks have large clustering coefficients.



tribution $P_{cum}(k_j) \propto k_j^{-\ln 3/\ln 2} \propto k_j^{1-\gamma}$. Here $k_j$ are
points of the discrete degree spectrum. Then we obtain

$$\gamma = 1 + \frac{\ln 3}{\ln 2} > 2. \qquad (47)$$

Compare this expression with the exponent (the fractal
dimension) in the relation between the "mass" and the
perimeter of the graph. Also, notice that the maximal
degree of a vertex is $k_{cut} = 2^{t+1} \sim N_t^{\ln 2/\ln 3} = N_t^{1/(\gamma-1)}$.
Other deterministic versions of the same simple model 
produce various discrete distributions (exponential and 
others). 
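The deterministic graph can be built explicitly for small t, which gives a direct check of the vertex and edge counts and of the discrete degree spectrum quoted in the text. The following is a minimal sketch; the function name and the edge-list representation are illustrative assumptions.

```python
from collections import Counter

def deterministic_graph(t_max):
    """Grow the graph of Fig. 23 for t_max steps, starting from a triangle at t = 0.

    Returns (number of vertices, number of edges, Counter {degree: number of vertices}).
    """
    edges = [(0, 1), (1, 2), (0, 2)]
    n_vertices = 3
    for _ in range(t_max):
        new_edges = []
        for u, v in edges:                  # every existing edge generates one new vertex
            w = n_vertices
            n_vertices += 1
            new_edges += [(u, w), (v, w)]   # the new vertex connects to both ends of the edge
        edges += new_edges
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return n_vertices, len(edges), Counter(degree.values())
```

Calling `deterministic_graph(3)` reproduces $N_t = 3(3^t+1)/2 = 42$ vertices and $L_t = 3^{t+1} = 81$ edges, with degrees 2, 4, 8, 16 carried by 27, 9, 3, 3 vertices. The number of edges triples at every step, so only small t is practical with this brute-force construction.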




D. Scaling relations and cutoff 



FIG. 23. A simple deterministic growing graph. At time 
t = 0, the graph is a triangle. At each time step every edge 
of the graph generates a new vertex which connects to both 
ends of the edge. 

One may slightly modify the model under consideration
and obtain a deterministic growing graph which has
a discrete spectrum of degrees. "Scale-free" networks of
this kind were recently proposed in Ref. [174]. At each
time step, let every edge of the graph generate a new
vertex which connects to both ends of the edge (see Fig.
23). The growth starts from a triangle (t = 0). Then
the total number of vertices at time t is $N_t = 3(3^t + 1)/2$
and the total number of edges is $L_t = 3^{t+1}$, so that the
average degree $\bar{k}_t = 4/(1 + 3^{-t})$ approaches 4 in the large-graph
limit. The "perimeter" of the graph (see Fig. 23)
is $P_t = 3 \times 2^t$, hence $N_t \sim P_t^{\ln 3/\ln 2}$ when t is large.
The clustering coefficient of the graph is large: $C \to 4/5$
as $t \to \infty$.

The spectrum of degrees of the graph is discrete:
at time t, the number n(k,t) of vertices
of degree $k = 2, 2^2, 2^3, \ldots, 2^{t-1}, 2^t, 2^{t+1}$ is equal to
$3^t, 3^{t-1}, 3^{t-2}, \ldots, 3^2, 3, 3$, respectively. Other values of
degree are absent in the spectrum. Clearly, for the large
network, n(k,t) decreases as a power law of k, so the network
may be called "scale-free". It is easy to introduce
the exponent $\gamma$ for this discrete situation where degree
points are inhomogeneously spread over the k axis. For
this one may calculate the corresponding cumulative dis-



In Secs. IX B and IX C we found that a number of
quantities of particular scale-free networks may be written
in a scaling form, and the scaling exponents involved
are connected by a simple relation. Can these forms and
relations be applied to all scale-free networks?

Let us proceed with general considerations. In this 
subsection, it is not essential, whether we consider de- 
gree, in-degree, or out-degree. Hence we use one general 
notation, k. When one speaks about scaling properties, 
a continuum treatment is sufficient, so that we can use 
the following expressions 



$$P(k,t) = \frac{1}{t} \int_{t_0}^{t} ds\, p(k,s,t) \qquad (48)$$

and

$$\bar{k}(s,t) = \int_{0}^{\infty} dk\, k\, p(k,s,t). \qquad (49)$$

In addition, we will need the normalization condition for
p(k,s,t),

$$\int_{0}^{\infty} dk\, p(k,s,t) = 1. \qquad (50)$$



If the stationary distribution exists, then from Eq.
(48), it follows that p(k,s,t) has to be of the form
$p(k,s,t) = \rho(k, s/t)$. From the normalization condition,
Eq. (50), we get $\int_0^{\infty} dk\, \rho(k,x) = 1$, so $\rho(k,x) =
g(x) f(k g(x))$, where g(x) and f(x) are arbitrary functions.

Let us assume that the stationary distribution P(k)
and the average degree $\bar{k}(s,t)$ exhibit scaling behavior,
that is, $P(k) \propto k^{-\gamma}$ for large k and $\bar{k}(s,t) \propto s^{-\beta}$ for
$1 \ll s \ll t$. Then, from Eq. (49), one sees that
$\int_0^{\infty} dk\, k\, \rho(k,x) \propto x^{-\beta}$. Substituting $\rho(k,x)$ into this relation,
one obtains $g(x) \propto x^{\beta}$. Of course, without loss
of generality, one may set $g(x) = x^{\beta}$, so that we obtain
the following scaling form of the degree distribution of
individual vertices,

$$p(k,s,t) = (s/t)^{\beta} f\!\left(k (s/t)^{\beta}\right). \qquad (51)$$

Finally, assuming the scaling behavior of P(k), i.e.,
$\int_0^{\infty} dx\, \rho(k,x) \propto k^{-\gamma}$, and using Eq. (51), we obtain
$\gamma = 1 + 1/\beta$, i.e., relation (34) between the exponents
is universal for scale-free networks. Here we used the
rapid convergence of $\rho(k,x)$ at large x (see Eqs. (36) and
(40)). One should note that in this derivation we did not
use any approximation.

The relation between the $\gamma$ and $\beta$ exponents looks precisely
the same as that for the $\gamma$ exponent of the degree
distribution and the corresponding exponent of Zipf's
law, $\nu$. One can easily understand the reason of this
coincidence. Recall that, in Zipf's law, the following dependence
is considered: k = f(r). Here r is the rank of a
vertex of the degree k, i.e., $r \propto \int_k^{\infty} dk\, P(k) \equiv P_{cum}(k)$.
If Zipf's law is valid, $k \propto r^{-\nu}$, then $r \propto k^{-1/\nu} \propto k^{-\gamma+1}$,
and we get $\gamma = 1 + 1/\nu$. Therefore, the $\beta$ exponent equals
the exponent of the Zipf's law, $\beta = \nu$.

Now we can discuss the size-effects in growing scale-free
networks. Accounting for the rapid decrease of the
function f(z) in Eq. (51), one sees that the power-law
dependence of the total degree distribution has a cut-off
at the characteristic value,



No scale-free networks with large values of $\gamma$ were observed.
The reason for this is clear. Indeed, the power-law
dependence of the degree distribution can be observed
only if it exists for at least 2 or 3 decades of degree.
For this, the networks have to be large: their size should
be, at least, $t > 10^{2.5(\gamma-1)}$. Then, if $\gamma$ is large, one practically
has no chances to find the scale-free behavior.
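The size estimate above is simple arithmetic, and a short sketch makes the numbers concrete. The function names and the default of 2.5 decades are illustrative assumptions that follow the estimate $t > 10^{2.5(\gamma-1)}$ and the cut-off relation $k_{cut}/k_0 \sim t^{1/(\gamma-1)}$.

```python
def min_network_size(gamma, decades=2.5):
    """Smallest size t for which the power law spans the given number of decades."""
    return 10.0 ** (decades * (gamma - 1.0))

def cutoff_degree(t, gamma, k0=1.0):
    """Order-of-magnitude cut-off k_cut ~ k0 * t**(1 / (gamma - 1))."""
    return k0 * t ** (1.0 / (gamma - 1.0))
```

For $\gamma = 3$ this gives a minimal size of about $10^5$ vertices, while for $\gamma = 5$ one would already need about $10^{10}$ vertices, more than the size of the largest network discussed here.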

In Fig. 24, in the log-linear scale, we present the values
of the $\gamma$ exponents of all the networks reported as
having power-law degree distributions vs. their sizes (see
also Tab. I). One sees that almost all the plotted points
are inside of the region restricted by the lines $\gamma = 2$ and
$\log_{10} t \sim 2.5(\gamma - 1)$, and by the logarithm of the size of
the largest scale-free network, the World-Wide Web,
$\log_{10} t \sim 9$.

In a similar way, we obtain the following general form
of P(k, t) for scale-free networks in the scaling regime:

$$P(k,t) = k^{-\gamma} F(k t^{-\beta}) = k^{-\gamma} F\!\left(k t^{-1/(\gamma-1)}\right). \qquad (53)$$

Here F(x) is a scaling function. We have obtained this
form for an exactly solvable network in Sec. IX C.



$$k_{cut} \sim t^{\beta} = t^{1/(\gamma-1)}. \qquad (52)$$



In fact, $k_{cut}$ is the generic scale of all "scale-free" networks.
It also follows from the condition $t \int_{k_{cut}}^{\infty} dk\, P(k) \sim
1$, i.e., $t \int_{k_{cut}}^{\infty} dk\, k^{-\gamma} \sim 1$. This means that only one vertex
in a network has degree above the cutoff. A more
precise estimate is $k_{cut}/k_0 \sim t^{1/(\gamma-1)}$, where $k_0$ is the lower
boundary of the power-law region of the degree distribution.
Eq. (52) can be used to estimate the $\gamma$ exponent if
the maximal degree in a network is known from empirical
data. We have already applied Eq. (52) in Sec.
V to check the quality of reported values of some real
networks.



We have shown (see Sec. IX C) that a trace of initial
conditions at $k \sim k_{cut}$ may be visible in a degree distribution
measured for any network size [166]. The cutoff (and
the trace of initial conditions) sets strong restrictions for
observations of power-law distributions since there are
few really large networks in Nature.

In fact, measurement of degree distributions is always
hindered by strong fluctuations at large k. The reason
of such fluctuations is the poor statistics in this region.
One can easily estimate the characteristic value of degree,
$k_f$, above which the fluctuations are strong. If
$P(k) \sim k^{-\gamma}$, then $t k_f^{-\gamma} \sim 1$. Therefore, $k_f \sim t^{1/\gamma}$. One may
improve the situation using the cumulative distributions,
$P_{cum}(k) \equiv \int_k^{\infty} dk\, P(k)$, instead of P(k). Also, in simulations,
one may make a lot of runs to improve the statistics.
Nevertheless, one can not exceed the cut-off, $k_{cut}$,
that we discuss. This cut-off is the real barrier for the observation
of the power-law dependence. (One should note
that accounting for the aging of nodes, break of links, or
disappearance of nodes suppresses the effect of the initial
conditions and removes the hump [171,175,176].)

network or subgraph                                      | number of vertices | number of edges | γ                      | Refs.
---------------------------------------------------------+--------------------+-----------------+------------------------+-------
complete map of the nd.edu domain of the Web             | 325,729            | 1,469,680       | γ_i = 2.1, γ_o = 2.45  |
pages of World Wide Web scanned by Altavista             | 2.711 × 10^8       | 2.130 × 10^9    | γ_i = 2.1, γ_o = 2.7   |
  in October of 1999                                     |                    |                 |                        |
" (another fitting of the same data)                     |                    |                 | γ_i = 2.10, γ_o = 2.82 |
domain level of the WWW in spring 1997                   | 2.60 × 10^5        |                 | γ_i = 1.94             |
inter-domain level of the Internet in December 1998      | 4389               | 8256            | 2.2                    |
net of operating "autonomous systems" in Internet ^1     | 6374               | 13641           | 2.2                    |
router level of the Internet in 1995                     | 3888               | 5012            | 2.5                    |
router level of the Internet in 2000 ^2                  | ~ 150,000          | ~ 200,000       | ~ 2.3                  | [104]
citations of the ISI database 1981 - June 1997           | 783,339            | 6,716,198       | γ_i = 3.0              |
" (another fitting of the same data)                     |                    |                 | γ_i = 2.9              |
" (another estimate from the same data)                  |                    |                 | γ_i = 2.5              |
citations of the Phys. Rev. D 11-50 (1975-1994)          | 24,296             | 351,872         | γ_i = 3.0              |
" (another fitting of the same data)                     |                    |                 | γ_i = 2.6              |
" (another estimate from the same data)                  |                    |                 | γ_i = 2.3              |
citations of the Phys. Rev. D (1982-June 1997)           | -                  | -               | γ_i = 1.9              | [100]
collaboration network of movie actors                    | 212,250            | 61,085,555      | 2.3                    |
" (another fitting of the same data)                     |                    |                 | 3.1                    | [102]
collaboration network of MEDLINE                         | 1,388,989          | 1.028 × 10^7    | 2.5                    |
collaboration net collected from mathematical journals   | 70,975             | 0.132 × 10^6    | 2.1                    |
collaboration net collected from neuro-science journals  | 209,293            | 1.214 × 10^6    | 2.4                    |
networks of metabolic reactions                          | ~ 500 - 800        | ~ 1500 - 3000   | γ_i = 2.2, γ_o = 2.2   |
net of protein-protein interactions (yeast proteome) ^3  | 1870               | 2240            | ~ 2.5                  |
word web ^4                                              | 470,000            | 17,000,000      | 1.5                    | [126]
digital electronic circuits                              | 2 × 10^4           | 4 × 10^4        | 3.0                    | [128]
telephone call graph ^5                                  | 47 × 10^6          | 8 × 10^7        | γ_i = 2.1              |
web of human sexual contacts ^6                          | 2810               |                 | 3.4                    | [132]
food webs ^7                                             | 93 - 154           | 405 - 366       | ~ 1                    |


TABLE I. Sizes and values of the 7 exponent of the networks or subgraphs 


reported as having power- law (in-, 


out-) de£ 


;ree 



distributions. For each network (or class of networks) data are presented in more or less historical order, so that the recent 
exciting progress is visible. Errors are not shown (see the caption of Fig. E3). They depen d on the size of a network and on 



the value of 7. We recommend our readers to look at the remark at the end of Sec. V C 2 before using these values 

data for the network of operating AS was obtaine d fo r one of days in December 1999 

3 



1 The 

The value of the 7 exponent was 
The network of protein-protein interaction is treated as undirected. 

5 The 



estimated from the degree distribution plot in Ref. [ 104 

4 The value of the 7 exponent for the word web is given for the range of degrees below the crossover point (see Fig 
out-degree distribution of the telephone call graph cannot be fitted by a power- law dependence (notice the remark in Sec. VF). 
6 In fact, the data was collected from a small set of vertices of the web of human sexual contacts. These vertices almost surely 
have no connections between them. 7 These food webs are truly small. In Refs. |5^,^l| degree distributions of such food webs 
were interpreted as exponential-like. 
FIG. 24. Log-linear plot of the $\gamma$ exponents of all the networks reported as having power-law (in-, out-) degree distributions (i.e., scale-free networks) vs. their sizes. The line $\gamma \approx 1 + \log_{10} t / 2.5$ is the estimate of the finite-size boundary for the observation of the power-law degree distributions for $\gamma > 2$. Here 2.5 is the range of degrees (orders) which we believe is necessary to observe a power law. The dashed line, $\gamma = 3$, is the resilience boundary (see Sec. XI D). This boundary is important for networks which must be stable to random breakdowns. The points are plotted using the data from Tab. I. Points for $\gamma_o$ and $\gamma_i$ from the same set of data are connected. The precision of the right points is about ±0.1 (?) and is much worse for points in the grey region. There exists a chance that some of these nets are actually not in the class of scale-free networks. The points: 1i and 1o are obtained from in- and out-degree distributions of the complete map of the nd.edu domain of the WWW [?]; 1i' and 1o' are from in- and out-degree distributions of the pages of the WWW scanned by Altavista in October of 1999 [?]; 1o'' is the $\gamma_o$ value from another fitting of the same data [?]; 1i''' is $\gamma_i$ for the domain level of the WWW in spring 1997 [117]; 2 is $\gamma$ for the inter-domain level of the Internet in December 1998 [?]; 2' is $\gamma$ for the network of operating AS on one of the days in December 1999 [?]; 3 is $\gamma$ for the router level of the Internet in 1995 [?]; 3' is $\gamma$ for the router level of the Internet in 2000 [104]; 4i is $\gamma_i$ for citations of the ISI database 1981 - June 1997 [?]; 4i' is the result of a different fitting of the same data [?]; 4i'' is another estimate obtained from the same data [94,95]; 4j is $\gamma_i$ for citations of the Phys. Rev. D 11-50 (1975-1994) [?]; 4j' is a different fitting of the same data [99]; 4j'' is another estimate from the same data [94,95]; 4j''' is $\gamma_i$ for citations of the Phys. Rev. D (1982 - June 1997) [100]; 5a is the $\gamma$ exponent for the collaboration network of movie actors [55]; 5a' is the result of another fitting for the same data [102]; 5b is $\gamma$ for the collaboration network of MEDLINE [13]; 5b' is $\gamma$ for the collaboration net collected from mathematical journals [13]; 5b'' is $\gamma$ for the collaboration net collected from neuro-science journals [?]; 6io is $\gamma_i = \gamma_o$ for networks of metabolic reactions [?]; 7 is $\gamma$ of the network of protein-protein interactions (yeast proteome) if it is treated as undirected [?]; 8 is $\gamma$ of the degree distribution of the word web in the range below the crossover point [126]; 9 is $\gamma$ of large digital electronic circuits [128]; 10 is $\gamma_i$ of the telephone call graph (the out-degree distribution of this graph cannot be fitted by a power-law dependence); 11 is $\gamma$ of vertices in the web of human sexual contacts [132].
E. Continuum approach

As we have already seen in Sec. IX A, the continuum approximation produces the exact value of $\gamma$ for the BA model. The first results for the exponents [?] were obtained using just this approximation (in Refs. [55,56] it was called "mean field"). Such an approach gives the exact values of the exponents for numerous models of growing scale-free networks and allows us to describe easily the main features of the network growth [171,175].

Let us briefly describe this simple technique. Passing to the continuum limits of $k$ and $t$ in any of the master equations written above for the degree distributions of individual vertices (e.g., in Eq. (10) for the exponential network or Eq. (20) for the BA model), we get linear first-order partial differential equations which have the following solution:

$$p(k,s,t) = \delta\bigl(k - \bar{k}(s,t)\bigr). \tag{54}$$

Of course, the form of this solution is rather far from the solutions of the corresponding exact master equations. Nevertheless, this $\delta$-function ansatz works effectively both for exponential and scale-free networks [171,175].

One may even dispense with master equations and proceed in the following way. In the simplest example, the BA model with one vertex and one edge added at each time step, the ansatz (54) immediately leads to the equation for the average degree of vertices:

$$\frac{\partial \bar{k}(s,t)}{\partial t} = \frac{\bar{k}(s,t)}{\int_0^t du\, \bar{k}(u,t)}. \tag{55}$$



Equation (55) also follows from the continuum limit of the master equation for $p(k,s,t)$ of this model, Eq. (20). It has a simple meaning: new edges are distributed among vertices proportionally to their degrees, as fixed by the rule of preferential linking. The initial condition is $\bar{k}(0,0) = 0$, and the boundary one is $\bar{k}(t,t) = 1$. One sees that Eq. (55) is consistent. Indeed, applying $\int_0^t ds$ to Eq. (55), we obtain

$$\frac{\partial}{\partial t}\int_0^t ds\, \bar{k}(s,t) = \int_0^t ds\, \frac{\partial}{\partial t}\bar{k}(s,t) + \bar{k}(t,t) = 1 + 1, \tag{56}$$

from which the proper relation follows,

$$\int_0^t ds\, \bar{k}(s,t) = 2t, \tag{57}$$

that is, the total degree in this case equals double the number of edges. Therefore, Eq. (55) takes the form

$$\frac{\partial \bar{k}(s,t)}{\partial t} = \frac{1}{2}\,\frac{\bar{k}(s,t)}{t}. \tag{58}$$

Its general solution is

$$\bar{k}(s,t) = C(s)\, t^{1/2}, \tag{59}$$



where $C(s)$ is an arbitrary function of $s$. Accounting for the boundary condition, $\bar{k}(t,t) = 1$, one has

$$\bar{k}(s,t) = \left(\frac{s}{t}\right)^{-1/2}. \tag{60}$$

Hence, the scaling exponent $\beta$ equals 1/2, as we have seen before.
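
As a rough numerical companion to Eqs. (55)-(60), the sketch below grows a BA-type tree (one vertex and one edge per time step) and compares the simulated mean degree of an early vertex with the continuum estimate $(s/t)^{-1/2}$. The function names, the seed graph (two vertices joined by one edge) and the endpoint-list sampling trick are assumptions made for illustration; they are not taken from the text, and agreement is only expected on average and up to finite-size corrections.

```python
import random


def grow_ba_tree(steps, seed=0):
    """Grow a BA-type tree: one new vertex with one edge per time step.

    Assumption (not from the text): the seed graph is two vertices joined
    by a single edge. Each new edge attaches to an old vertex chosen with
    probability proportional to its degree.
    Returns the list of degrees, indexed by order of arrival.
    """
    rng = random.Random(seed)
    degrees = [1, 1]      # seed graph: vertices 0 and 1 joined by one edge
    endpoints = [0, 1]    # every edge contributes both of its end vertices
    for _ in range(steps):
        # A uniform pick from the endpoint list selects a vertex with
        # probability proportional to its current degree.
        target = rng.choice(endpoints)
        new_vertex = len(degrees)
        degrees.append(1)
        degrees[target] += 1
        endpoints.extend((target, new_vertex))
    return degrees


def continuum_degree(s, t):
    """Continuum estimate of Eq. (60): (s/t)^(-1/2)."""
    return (s / t) ** -0.5


if __name__ == "__main__":
    t, runs, probe = 5000, 20, 50   # probe: index (roughly the birth time) of the tracked vertex
    mean_k = sum(grow_ba_tree(t, seed=r)[probe] for r in range(runs)) / runs
    print("simulated mean degree:", mean_k)
    print("continuum estimate   :", continuum_degree(probe, t))
```

Independently of the random seed, the total degree returned by `grow_ba_tree` equals twice the number of edges, which is the discrete counterpart of Eq. (57).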

In the continuum approach, the expression for the total degree distribution is of the form

$$P(k,t) = \frac{1}{t}\int_0^t ds\, \delta\bigl(k - \bar{k}(s,t)\bigr) = -\frac{1}{t}\left(\frac{\partial \bar{k}(s,t)}{\partial s}\right)^{-1}\bigl[s = s(k,t)\bigr], \tag{61}$$

where $s(k,t)$ is a solution of the equation $k = \bar{k}(s,t)$. Using Eq. (61), one may immediately reproduce the scaling relation between the exponents, $\gamma = 1 + 1/\beta$. Therefore, in the present case, $\gamma = 3$.
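
The step from Eq. (61) to the scaling relation can be written out explicitly. The following is a sketch under the assumption that the average degree has the pure power-law form $\bar{k}(s,t) = (s/t)^{-\beta}$ for all $s$, which holds in the BA example of Eq. (60); in other models only the asymptotic behaviour at small $s/t$ matters.

```latex
\begin{align*}
k = \bar{k}(s,t) = (s/t)^{-\beta}
  &\;\Rightarrow\; s(k,t) = t\,k^{-1/\beta}, \\
\frac{\partial \bar{k}(s,t)}{\partial s}
  &= -\frac{\beta}{s}\,\bar{k}(s,t)
   \;\xrightarrow{\;s = s(k,t)\;}\; -\frac{\beta}{t}\,k^{1+1/\beta}, \\
P(k,t) &= -\frac{1}{t}
   \left(\frac{\partial \bar{k}(s,t)}{\partial s}\right)^{-1}_{s=s(k,t)}
        = \frac{1}{\beta}\,k^{-(1+1/\beta)} \;\propto\; k^{-\gamma},
\qquad \gamma = 1 + \frac{1}{\beta}.
\end{align*}
```

For $\beta = 1/2$ this gives $P(k) = 2k^{-3}$, which is normalized on $k \geq 1$ in the continuum limit.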



F. More complex models and estimates for the WWW

One may consider more complex growing networks [175,176]. We will demonstrate that scale-free nets may be obtained even without "pure" preferential linking. It is convenient to consider incoming edges here, so we use the following notation for in-degree, $q \equiv k_i$.




FIG. 25. Scheme of the growth of the network with a mixture of the preferential and random linking (compare with the schematic Fig. [?] for the WWW growth). At each time step, a new vertex with $n$ incoming edges is added. Simultaneously, the target ends of $m$ new edges are distributed among vertices according to a rule of preferential linking, and, in addition, the target ends of $n_r$ new edges are attached to randomly chosen vertices. The source ends of each edge may be anywhere.
Let us describe the model (see Fig. 25):

(i) At each time step, a new vertex is added.

(ii) It has $n$ incoming edges which go out from arbitrary vertices or even from some external source.

(iii) Simultaneously, $m$ extra edges are distributed with preference. This means that they go out from non-specified vertices or from an external source, but the target end of each of them is attached to a vertex chosen preferentially: the probability to choose some particular vertex is proportional to $q + A$. Here $A$ is a constant which we call additional attractiveness (see Sec. IX B). We shall see that its reasonable values are $A > -n - n_r$.

(iv) In addition, at each time step, the target ends of $n_r$ edges are distributed among vertices randomly, without any preference. Again, these edges may go out from anywhere.

In the continuum approach, one can assume that $m$ and $n$ are not necessarily integer numbers but are positive. Note that here we do not include into consideration the source ends of edges, since we are studying only in-degree distributions.

The equation for the average in-degree of vertices in this network has the form

$$\frac{\partial \bar{q}(s,t)}{\partial t} = \frac{n_r}{t} + m\,\frac{\bar{q}(s,t) + A}{\int_0^t du\,[\bar{q}(u,t) + A]}, \tag{62}$$

with the initial condition, $\bar{q}(0,0) = 0$, and the boundary one, $\bar{q}(t,t) = n$. The first term on the right-hand side accounts for linking without preference, and the second one for the preferential linking. In this case, $\int_0^t ds\, \bar{q}(s,t) = (n_r + m + n)t$. It follows from Eq. (62) that $\beta = m/(m + n_r + n + A)$, so $0 < \beta < 1$, and

$$\gamma_i = 1 + \frac{1}{\beta} = 2 + \frac{n_r + n + A}{m}. \tag{63}$$

Thus, the additional fraction of randomly distributed edges does not suppress the power-law dependence of the degree distributions but only increases $\gamma_i$, which is in the range between 2 and infinity.
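
The dependence of the exponents on the model parameters can be tabulated with a few lines of code. The sketch below simply evaluates $\beta = m/(m + n_r + n + A)$ and $\gamma_i = 1 + 1/\beta$; the function name and argument order are hypothetical and chosen only for this illustration.

```python
def mixed_linking_exponents(m, n, n_r=0.0, A=0.0):
    """Continuum exponents for the mixed preferential/random linking model.

    m   : preferentially distributed edges per time step
    n   : incoming edges of each new vertex
    n_r : randomly distributed edges per time step
    A   : additional attractiveness (must satisfy A > -(n + n_r))
    Returns (beta, gamma_in) with beta = m / (m + n_r + n + A)
    and gamma_in = 1 + 1/beta.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    if n + n_r + A <= 0:
        raise ValueError("need A > -(n + n_r)")
    beta = m / (m + n_r + n + A)
    gamma_in = 1.0 + 1.0 / beta
    return beta, gamma_in


if __name__ == "__main__":
    # Pure preferential linking with n = m = 1 and A = 0 (BA-like case).
    print(mixed_linking_exponents(1, 1))
    # Adding randomly distributed edges raises gamma_in but keeps a power law.
    print(mixed_linking_exponents(1, 1, n_r=2))
```

With $m = n = 1$ and $n_r = A = 0$ the function returns $\beta = 1/2$ and $\gamma_i = 3$; increasing $n_r$ or $A$ moves $\gamma_i$ upward without bound.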

This model allows one to obtain some estimates for the exponents of in- and out-degree distributions of the WWW [177,178]. Let us discuss, first, the in-degree distribution. We have already explained how new pages appear in the Web (see Sec. V C 2). The introduced model, at least, resembles this process. The problem is that we do not know the values of the quantities entering Eq. (63).

The constant $A$ may take any value between $-(n_r + n)$ and infinity; the number of randomly distributed edges, $n_r$, may in principle be not small (there exist many individuals making their references practically at random); and $n$ is not fixed. From the experimental data [?] (see Sec. V C 2) we know more or less the sum $m + n + n_r \sim 10 \gg 1$ (between 7 and 10, more precisely), and that is all.

The only thing we can do is to fix the scales of the quantities. The natural characteristic values for $n_r + n + A$ in Eq. (63) are (a) 0, (b) 1, (c) $m \gg 1$, and (d) infinity. In the first case, all new edges are attached to the oldest vertex since only this one is attractive for linking, and $\gamma_i \to 2$. In the last case, there is no preferential linking, and the network is not scale-free, $\gamma_i \to \infty$. Let us consider the truly important cases (b) and (c).

(b) Let us assume that the process of the appearance of each document in the Web is as simple as the procedure of the creation of your personal home page described in Sec. V C 2. If only one reference to the new document ($n = 1$) appears, and if one forgets about the terms $n_r$ and $A$ in Eq. (63), then, for the $\gamma_i$ exponent of the in-degree distribution, we immediately get the estimate $\gamma_i - 2 \sim 1/m \sim 10^{-1}$. This estimate indeed coincides with the experimental value $\gamma_i - 2 = 0.1$ [?] (see Sec. V C 2). Therefore, the estimation looks good. Nevertheless, we should repeat that this estimate follows only from the fixation of the scales of the involved quantities, and many real processes are not accounted for in it.

(c) Above we discussed the distribution of incoming links. Eq. (63) may also be applied to the distribution of links which go out from documents of the Web, since the model of the previous section can easily be reformulated for outgoing edges of vertices. In this case all the quantities in Eq. (63) take other values which are again unknown. However, we can estimate them. As we explained in Sec. V C 2, there are usually several citations ($n$) in each new WWW document. In addition, one may think that the number of links distributed without any preference, $n_r$, is not small now. Indeed, even beginners proceed by linking of their pages. Hence, $n + n_r \sim m$ (we have no other available scale), and $\gamma_o - 2 \sim m/m \sim 1$. We can compare this estimate with the experimental value, $\gamma_o - 2 = 0.7$ [?].

Unfortunately, numerous channels of linking make similar contributions to the values of the exponents of the degree distributions, so quite "honest" estimates are impossible. Let us introduce the "general" model of a growing directed network. In this model we account for the main channels of linking which yield contributions of the same order to $\gamma_i$ and $\gamma_o$. This will demonstrate the complexity of the problem.

FIG. 26. Scheme of the growth of the generalized model of a directed network. (i) At each time step, a new vertex is added. (ii) It has both outgoing and incoming edges. The target ends of $r_o$ outgoing edges are distributed randomly among old vertices. The source ends of $r_i$ incoming edges are distributed randomly among old vertices. The target ends of $p_o$ outgoing edges are distributed preferentially among old vertices with probability proportional to $k_i + A_i$. The source ends of $p_i$ incoming edges are distributed preferentially among old vertices with probability proportional to $k_o + A_o$. (iii) Simultaneously, $p'$ edges are distributed preferentially between old vertices. Their target ends are distributed with the preference function $k_i + A'_i$ and their source ends with the preference function $k_o + A'_o$. (iv) In addition, $r'$ edges are distributed without preference among old vertices. (v) In addition, $p''_i$ connections appear between old vertices with source ends being distributed without preference and with target ends distributed with the preference function $k_i + A''_i$. Finally, $p''_o$ edges emerge between old vertices with target ends being distributed without preference and with source ends distributed with the preference function $k_o + A''_o$. Here $A_i$, $A_o$, $A'_i$, $A'_o$, $A''_i$, and $A''_o$ are constants. The total number of connections that emerge at each increment of time is $n_t = p_o + p_i + r_o + r_i + r' + p' + p''_i + p''_o$.

The number of possible channels is so large that we have to introduce new notations. The network grows by the rules described in the caption of Fig. 26. We account for all combinations of linking without preference and linear preferential linking. Some new edges appear between new and old vertices, others connect pairs of old vertices. For different channels of linking, the parameters of preferential linking differ from each other. Additional attractiveness takes different values for target and source ends of preferentially distributed edges. For brevity, we use the simplest preference functions of the form $k_i + A_i$ for the distribution of target ends of links and of the form $k_o + A_o$ for the distribution of source ends. In fact, this model generalizes the known models of networks with preferential linking of directed edges [169,175,178].

The above growing network is scale-free. Its exponents may be obtained in the continuum approach framework. Fortunately, part of the parameters introduced in Fig. 26 disappear from the final expressions for $\gamma_i$ and $\gamma_o$:

$$\gamma_i = 1 + \left[\frac{p_o}{n_t + A_i} + \frac{p'}{n_t + A'_i} + \frac{p''_i}{n_t + A''_i}\right]^{-1}, \qquad \gamma_o = 1 + \left[\frac{p_i}{n_t + A_o} + \frac{p'}{n_t + A'_o} + \frac{p''_o}{n_t + A''_o}\right]^{-1}. \tag{64}$$

In principle, one must account for all the above contributions. One may check that Eq. (63) is a particular case of Eq. (64). Twelve unknown parameters $(n_t, p_o, p_i, p', p''_i, p''_o, A_i, A_o, A'_i, A'_o, A''_i, A''_o)$ in Eq. (64) make the problem of improving the estimates of $\gamma_{i,o}$ (see (b) and (c)) hardly solvable.
FIG. 27. Scheme of growth of an undirected network with creation of connections between already existing vertices. At each time step, (i) a new vertex is added; (ii) it connects to $m$ preferentially chosen old vertices; (iii) $cm$ new edges connect pairs of preferentially chosen old vertices.

A simpler case of a growing undirected network was considered in Ref. [176]. At each time step, apart from $m$ new edges between a new vertex and old vertices, $mc$ new edges are created between the old vertices (see Fig. 27), so that the average degree of the network is $\bar{k} = 2m(1 + c)$. The connections to a new vertex are distributed among old vertices as in the Barabási-Albert model. The probability that a new edge is attached to existing vertices of degrees $k_\mu$ and $k_\nu$ is proportional to $k_\mu k_\nu$. Here $\mu$ and $\nu$ are labels of the vertices. The resulting degree distribution is of a power-law form with the exponent

$$\gamma = 2 + \frac{1}{1 + 2c} = 2 + \frac{m}{\bar{k} - m}. \tag{65}$$

Thus, $2 < \gamma < 3$. The same expression is valid if, at each time step, we delete $-mc > 0$ randomly chosen edges (here $c < 0$).




FIG. 28. Scheme of the growth of an undirected network with the rewiring of connections in the old part of the network. At each time step, (i) a new vertex is added; (ii) it connects to $m$ preferentially chosen old vertices; (iii) $m_r$ old vertices are chosen at random, and, from each of these vertices, one of its edges is rewired to another vertex. In $m_{rr}$ cases, the rewiring occurs to randomly chosen vertices. In the remaining $m_r - m_{rr}$ cases, the rewired edge ends are attached to preferentially chosen vertices.



A rewiring of edges produces a very similar effect [102]. Now, instead of the creation of connections in the old part of an undirected growing network, at each time step, let each of $m_r$ randomly chosen vertices lose one of its connections (see Fig. 28). In $m_{rr}$ cases, a free end is attached to a random vertex. In the remaining $m_{rp} = m_r - m_{rr}$ cases, a free end is attached to a preferentially chosen vertex. The continuum equation for the mean degree has the form

$$\frac{\partial \bar{k}(s,t)}{\partial t} = (m + m_{rp})\,\frac{\bar{k}(s,t)}{\int_0^t du\, \bar{k}(u,t)} - \frac{m_{rp}}{t}, \tag{66}$$

with the boundary condition $\bar{k}(t,t) = m$. From this, one gets the following expression for the exponent of the degree distribution:

$$\gamma = 2 + \frac{m - m_{rp}}{m + m_{rp}}. \tag{67}$$

Notice that here we did not account for the emergence of bare vertices, so the number of rewirings $m_r$ has to be small enough. From Eq. (67), it follows that, as the number of preferential rewirings grows, the $\gamma$ exponent decreases. Moreover, simulations in Ref. [102] demonstrated that, when the number of rewirings is high enough, the degree distribution changes from a power law to an exponential one.
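
The way the exponent follows from the continuum equation can be sketched as below. This is a reconstruction under stated assumptions rather than the authors' own derivation: the preferential gain term is taken as $(m + m_{rp})\,\bar{k}/\int_0^t du\,\bar{k}$, the net uniform term as $-(m_r - m_{rr})/t = -m_{rp}/t$, and the total degree as $2mt$ because rewiring conserves the number of edges.

```latex
\begin{align*}
\int_0^t du\,\bar{k}(u,t) = 2mt
  &\;\Rightarrow\;
  \frac{\partial \bar{k}(s,t)}{\partial t}
  = \beta\,\frac{\bar{k}(s,t)}{t} - \frac{m_{rp}}{t},
  \qquad \beta = \frac{m + m_{rp}}{2m}, \\
\bar{k}(s,t) &= \frac{m_{rp}}{\beta}
  + \left(m - \frac{m_{rp}}{\beta}\right)\left(\frac{s}{t}\right)^{-\beta}, \\
\gamma &= 1 + \frac{1}{\beta} = 1 + \frac{2m}{m + m_{rp}}
        = 2 + \frac{m - m_{rp}}{m + m_{rp}}.
\end{align*}
```

For $m_{rp} = 0$ this reduces to $\gamma = 3$, and $\gamma$ decreases as $m_{rp}$ grows, in line with the remark in the text.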

One sees that the power-law in- and out-degree distributions arise from the power-law singularities $\bar{k}_i(s,t) \propto (s/t)^{-\beta_i}$ and $\bar{k}_o(s,t) \propto (s/t)^{-\beta_o}$ at the point $s = 0$ (the oldest vertex). Therefore, the same vertices, as a rule, have high values of both in- and out-degree. This means that the in- and out-degrees of vertices correlate, and, of course, $P(k_i, k_o) \neq P(k_i)P(k_o)$ (see the discussion in the paper of Krapivsky, Rodgers, and Redner [179]). Moreover, even if we exclude the preferential linking from such a network growth process, the rule "the oldest is the richest" is still valid for both in- and out-degree, and hence the correlation between $k_i$ and $k_o$ is again present.

In Ref. [179], the distribution $P(k_i, k_o)$ was analytically calculated for a model of this type. To get the exact result, the authors of this paper accounted for only two channels of the preferential attachment of new edges and made a number of simplifying assumptions. In their model, (i) a new edge may go out of a new vertex and, in this case, its target end is attached to some old vertex chosen with the probability proportional to $k_i + A_i$. (ii) Another possibility is the connection of two old vertices ($\mu$) and ($\nu$) with the probability proportional to $(k_i^{(\mu)} + A'_i)(k_o^{(\nu)} + A'_o)$ (in Ref. [179], $A'_i = A_i$). Here $k_i^{(\mu)}$ and $k_o^{(\nu)}$ are the in- and out-degrees of these vertices. In addition, the parameters of the model [179] are chosen in such a way that the exponents of the in- and out-degree distributions are equal, $\gamma_i = \gamma_o$. The resulting distribution has the following asymptotic form for large $k_i$ and $k_o$:

$$P(k_i, k_o) \propto \frac{k_i^{A'_i - 1}\, k_o^{A'_o}}{(k_i + k_o)^{2A'_i + 1}}, \tag{68}$$

which is very different from the product $P(k_i)P(k_o)$.

A model of growing directed networks with preferential linking was simulated in the paper [180]. In- and out-degree distributions were observed to be of power-law form. The distribution of the sizes of connected clusters may also be interpreted as a power-law dependence in some range of the parameters of this model.



G. Types of preference providing scale-free networks

Many efforts were made to analyse different preference functions producing scale-free networks. The power-law preference function, $k^y$, does not produce power-law degree distributions if $y \neq 1$ (see Sec. [?]). One can check that the necessary condition is a linear asymptotic form of the preference function at large values of degree [94,95], so the function, in principle, may be nonlinear. Nevertheless, the main features can be understood if one considers linear preference functions. In general, the probability for a new link to be attached to a vertex $s$ at time $t$ is $p(s,t) = G(s,t)k(s,t) + A(s,t)$. The coefficient $G(s,t)$ may be called the fitness of a vertex [184,181], and $A(s,t)$ is the additional attractiveness. As we have seen, $A$ can change the values of the exponents. The effect of the variation of $G$ may be even stronger.

One can consider the following particular cases:

(i) $G = \text{const}$, $A = A(s)$. In this case, the additional attractiveness $A(s)$ may be treated as ascribed to individual vertices. A possible generalization is to make it a random quantity. One can check that the answers do not change crucially: one only has to substitute the average value, $\bar{A}$, instead of $A$, into the previous expressions for the scaling exponents.

Note that $n$ and $m$ may also be made random, and this can be accounted for by the substitution of $\bar{n}$ and $\bar{m}$ into the expressions for the exponents.

There exists a more interesting possibility: to construct a direct generalization of the network considered in Sec. IX F, where the combination of the preferential and random linking was described. For this, we may ascribe the additional attractiveness not to vertices but to new edges and again make it a random quantity. In such an event, new edges play the role of fans with different passion for popularity of their idols, vertices. This is the case (ii), $G = \text{const}$, $A = A(t)$, where $A(t)$ is random. If the distribution function of $A$ is $P(A)$, the $\gamma$ exponent equals

$$\gamma = 1 + \left[\int dA\, \frac{P(A)}{1 + (n + A)/m}\right]^{-1}; \tag{69}$$

see Ref. [175]. The values of the exponent are again between 2 and $\infty$.

Let us pass to situations where $A = \text{const}$.

(iii) $G = G(t)$. This case reduces to case (ii).
(iv) $G(s,t) = f(t - s)$, aging of vertices. This case was considered in Refs. [171,175]. In particular, such a form of a preference function is quite reasonable in citation networks. Indeed, we rarely cite old papers. One may check that, to keep the network scale-free, the function has to be of a power-law form, $G(s,t) = (t - s)^{-\alpha}$. In principle, the exponent $\alpha$ may be of any sign: $-\infty < \alpha < \infty$. Negative values of $\alpha$ are typical for very conservative citation networks (many references to the Bible). Variation of the aging exponent $\alpha$ produces quite distinct networks, see Fig. 29. If $\alpha$ is negative, links tend to be attached to the oldest vertices; if $\alpha$ is large, the network becomes a chain structure.



Again it is possible to use the continuum approach. For the undirected network to which one vertex with one edge is added at each time step, after the introduction of the scaling variables, $\kappa(s/t) \equiv \bar{k}(s,t)$ and $\xi \equiv s/t$, one gets

$$-\xi(1-\xi)^{\alpha}\,\frac{d\ln\kappa(\xi)}{d\xi} = \left[\int_0^1 d\zeta\, \kappa(\zeta)(1-\zeta)^{-\alpha}\right]^{-1} = \beta, \qquad \kappa(1) = 1. \tag{70}$$

From Eq. (70), one obtains the solution $\kappa(\xi, \beta)$. Substituting it into the right equality in Eq. (70) or, equivalently, into $\int_0^1 d\xi\, \kappa(\xi) = 2$, we get a transcendental equation for $\beta$. The resulting exponents, $0 < \beta < 1$ and $2 < \gamma < \infty$, are shown in Figs. 30 and 31 [171,175].

FIG. 29. Change of the structure of the network with aging of vertices with increase of the aging exponent $\alpha$. The aging is proportional to $\tau^{-\alpha}$, where $\tau$ is the age of a vertex. The network grows clockwise starting from the vertex below on the left. At each time step, a new vertex with one edge is added.

FIG. 30. $\beta$ exponent of the average degree vs. the aging exponent $\alpha$ of the network with aging of vertices. The points are obtained from the simulations [171]. The line is the result of the calculations. The inset shows the analytical solution in the range $-5 < \alpha < 1$. Note that $\beta \to 1$ when $\alpha \to -\infty$.

FIG. 31. $\gamma$ exponent of the degree distribution vs. the aging exponent $\alpha$ of the network with aging of vertices. The points show the results of simulations [171]. The line is the analytical result. The inset depicts the analytical solution in the range $-5 < \alpha < 1$.



(v) $G = G(s)$, a fluctuating function [181].

Let $G$ be a random variable having distribution $P(G)$. Then, in the particular case of the BA model, one gets the following equation for the average degree:

$$\frac{\partial \bar{k}(s,t)}{\partial t} = \frac{G(s)\bar{k}(s,t)}{\int_0^t du\, G(u)\bar{k}(u,t)} = \beta(G)\,\frac{\bar{k}(s,t)}{t}, \tag{71}$$

with the boundary condition $\bar{k}(t,t) = 1$ (we set $m = 1$ without loss of generality). In this case, $\bar{k}(s,t) = (s/t)^{-\beta(G)}$. Then, it is easy to check that $\beta(G) = cG$, where the constant $c$ can be obtained from the equation

$$\int dG\, P(G)\,\frac{G}{1 - cG} = \frac{1}{c}. \tag{72}$$


Using Eq. (61), one gets the distribution

$$P(k) \propto \int dG\, P(G)\left(1 + \frac{1}{cG}\right) k^{-[1 + 1/(cG)]}. \tag{73}$$

If $P(G) = \delta(G - G_0)$, the network is the original BA model. If $P(G)$ is a distributed function, the answer changes. For instance, for $P(G) = \theta(G)\theta(1 - G)$, i.e., when $G$ is homogeneously distributed in the range (0, 1), the distribution takes the following form:

$$P(k) \propto \int_0^1 dG \left(1 + \frac{1}{cG}\right) k^{-[1 + 1/(cG)]} \sim \frac{k^{-(1 + 1/c)}}{\ln k}. \tag{74}$$

Here, the constant $c$, which is the solution of Eq. (72) with the homogeneous distribution $P(G)$, equals 0.797..., so $\gamma = 2.255\ldots$, i.e., it is smaller than the value $\gamma = 3$ for the homogeneous BA model.
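
The quoted value of $c$ can be checked numerically. For $G$ uniform on (0, 1) the integral in Eq. (72) is elementary, $\int_0^1 dG\, G/(1 - cG) = -1/c - \ln(1 - c)/c^2$, so Eq. (72) reduces to $1 - c = e^{-2c}$. The sketch below solves this by bisection; the function name and the bracketing interval are choices made for this illustration only.

```python
import math


def fitness_constant_uniform(tol=1e-12):
    """Solve Eq. (72) for c when G is uniformly distributed on (0, 1).

    With P(G) = 1 on (0, 1), Eq. (72) reduces to 1 - c = exp(-2 c).
    The nontrivial root in (0, 1) is located by bisection.
    """
    def f(c):
        return 1.0 - c - math.exp(-2.0 * c)

    lo, hi = 0.5, 0.999   # f(lo) > 0 and f(hi) < 0 bracket the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c = fitness_constant_uniform()
    print("c     =", c)
    print("gamma =", 1.0 + 1.0 / c)
```

The root is close to 0.797 and the corresponding exponent $\gamma = 1 + 1/c$ is close to 2.255, in agreement with the values given in the text.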

We emphasize that $\gamma$ depends on the form of the distribution $P(G)$. In particular, results obtained with a distribution $P(G)$ which consists of two delta-functions are discussed in Sec. IX H. The fluctuations of $G$ may also be introduced into the models of growing networks from Secs. IX B, IX E, and IX F. Results are similar to Eq. (73).

A combination of fluctuating additional attractiveness $A(s)$ and fluctuating fitness $G(s)$ was considered in Ref. [182]. Calculations in this paper are very similar to the above derivation (an explicit rate-equation approach [94,95,179] was used), but the result, namely the $\gamma$ exponent of the power-law dependence, corrected by a logarithmic denominator, depends on the form of the joint distribution $P(A, G)$. In Ref. [182] one may also find the results for the exponents of in- and out-degree distributions of a directed growing network with fluctuating fitness.



One should note that most of the existing models of networks growing under the mechanism of preferential linking produce the effect "the oldest is the richest". (Here we do not dwell on situations when old vertices may die, divide into parts, or stop attaching new edges. The last possibility was studied in Refs. [167,168], and this is the case where young vertices may have larger degrees than old vertices. Also, if vertices may divide into parts, it is hard to define the age of a vertex at large temporal scales.) Even in the case of fluctuating fitness $G$, older vertices are, with high probability, of larger degree than young vertices. Indeed, the power-law degree distributions have to be accompanied by strong singularities of the average degree of individual vertices $\bar{k}(s,t)$ at $s = 0$. This follows from the derivations of scale-free degree distributions in the present section. The fluctuations of $G$ produce broadening of the degree distribution of individual vertices $p(k,s,t)$. In Sec. IX H we will consider the situation in which the inhomogeneity of $G$ provides a stronger effect than considered here.

It was stated in Ref. [117] that the degree distribution of individual vertices (sites) of the Web practically does not depend on their age. This indicates the inapplicability of the preferential linking concept. The authors of Ref. [183] explained that these data are not sufficient to exclude the rule "the oldest is the richest" and that just the inhomogeneity of fitness $G$ hampers the observation of such an effect.



H. "Condensation" of edges 



In the last of the above situations, that is, in the case of inhomogeneous fitness $G$, the form of the resulting degree distributions crucially depends on the form of the distribution $P(G)$. For some special forms of $P(G)$, a striking phenomenon occurs [184]. One or several of the "strongest" vertices with the largest $G$ may capture a finite fraction of all edges. A related effect was considered in Ref. [185]. In Ref. [184] this intriguing effect was called "Bose-Einstein condensation". One can explain the essence of this phenomenon using a simple example [175].

Let us use the model of a growing network with directed edges introduced in Sec. IX E. To simplify the formulas, we set $A = 0$ (one can see that this does not reduce the generality of the model, which produces scaling exponents in the wide ranges of values $2 < \gamma < \infty$ and $0 < \beta < 1$). Let the rule of preference be the same as in Sec. IX E, i.e., the probability that an edge is attached to vertex $s$ is proportional to the in-degree $q_s$ of the vertex, but with one exception: one vertex, $\tilde{s}$, is "stronger" than the others. This means that the probability that this vertex attracts an edge is higher: it has an additional factor, $g > 1$, and is proportional to $g q_{\tilde{s}}$. This means that $G_s = 1 + (g - 1)\delta_{s,\tilde{s}}$. The equations for the average in-degree are
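
$$\begin{aligned} \frac{\partial \bar{q}_{\tilde{s}}(t)}{\partial t} &= m\,\frac{g\,\bar{q}_{\tilde{s}}(t)}{(g-1)\bar{q}_{\tilde{s}}(t) + \int_0^t ds\, \bar{q}(s,t)}, \qquad \bar{q}_{\tilde{s}}(t = \tilde{s}) = q_i, \\ \frac{\partial \bar{q}(s,t)}{\partial t} &= m\,\frac{\bar{q}(s,t)}{(g-1)\bar{q}_{\tilde{s}}(t) + \int_0^t ds\, \bar{q}(s,t)}, \qquad \bar{q}(t,t) = n. \end{aligned} \tag{75}$$

In the second of Eqs. (75), $s \neq \tilde{s}$. Obviously, at long times, the total in-degree of the network is $\int_0^t ds\, \bar{q}(s,t) = (m + n)t + O(1)$.

At $t \gg \tilde{s}$, two situations are possible. In the first one, the in-degree $\bar{q}_{\tilde{s}}(t)$ of the strongest vertex grows slower than $t$, and, at long times, the denominators are equal to $(m + n)t$, so that we get the exponents $\beta = m/(m + n) \equiv \beta_0$ and $\gamma = 2 + n/m \equiv \gamma_0 = 1 + 1/\beta_0$, where $0 < \beta_0 < 1$ and $2 < \gamma_0 < \infty$. Here, we introduce the exponents, $\gamma_0$ and $\beta_0$, of the network in which all vertices have equal "strength" (fitness), $g = 1$. The first line of Eq. (75), in this case, looks as

$$\frac{\partial \bar{q}_{\tilde{s}}(t)}{\partial t} = \frac{gm}{m + n}\,\frac{\bar{q}_{\tilde{s}}(t)}{t}. \tag{76}$$

Hence, at long times, $\bar{q}_{\tilde{s}}(t) = \text{const}(q_i)\, t^{gm/(m+n)}$, and we see that the in-degree of the strong vertex does grow slower than $t$ only for

$$g < g_c \equiv 1 + \frac{n}{m} = \gamma_0 - 1 = \beta_0^{-1} > 1, \tag{77}$$

so we obtain the natural threshold value.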

In the other situation, $g > g_c$, at long times, we have the only possibility, $\bar{q}_{\tilde{s}}(t) = d\,t$, where $d$ is some constant, $d < m + n$, since a more rapid growth of $\bar{q}_{\tilde{s}}(t)$ is impossible in principle. This means that, for $g > g_c$, a finite fraction of all preferentially distributed edges is captured by the strong vertex (in Ref. [184] just this situation is called the Bose-Einstein condensation). We see that a single strong vertex may produce a macroscopic effect. In this case, Eq. (75) takes the form

$$\begin{aligned} \frac{\partial \bar{q}_{\tilde{s}}(t)}{\partial t} &= \frac{gm}{(g-1)d + m + n}\,\frac{\bar{q}_{\tilde{s}}(t)}{t}, \\ \frac{\partial \bar{q}(s,t)}{\partial t} &= \frac{m}{(g-1)d + m + n}\,\frac{\bar{q}(s,t)}{t}, \end{aligned} \tag{78}$$

where, in the second of Eqs. (78), $s \neq \tilde{s}$. Note that the coefficient in the first equation is always larger than the coefficient in the second equation, since $g > g_c > 1$. From the first of Eqs. (78), we get the condition

$$\frac{gm}{(g-1)d + m + n} = 1, \tag{79}$$

so, for $g > g_c$, the following fraction of all edges in the network is captured by the strongest vertex:

$$\frac{d}{m + n} = \frac{1}{g_c}\,\frac{g - g_c}{g - 1}. \tag{80}$$

We have to emphasize that the resulting value of $d$ is independent of the initial conditions! (Recall that we consider the long-time limit.) This "condensation" of edges on the "strongest" vertex leads to a change of the exponents. Using the condition Eq. (79), we readily get the following expressions for them:

$$\beta = \frac{1}{g} < \beta_0, \qquad \gamma = 1 + g > \gamma_0. \tag{81}$$

The fraction of all edges captured by the strongest vertex and the $\beta$ and $\gamma$ exponents vs. $g$ are shown in Fig. 32. Note that the growth of $g$ increases the value of the $\gamma$ exponent. If the World is captured by Bill Gates or some czar, the distribution of wealth becomes more fair! One should note that the strong vertex does not take edges away from other vertices but only intercepts them. The closer $\gamma_0$ is to 2, the smaller the $g$ necessary to exceed the threshold. Above the threshold, the values of the exponents are determined only by the factor $g$. Nevertheless, the expression for $d/(m + n)$ contains $\gamma_0$, the exponent of the homogeneous network. (Recall that the threshold value is $g_c = \gamma_0 - 1$.)
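
The long-time results on both sides of the threshold can be collected in one small routine. The sketch below evaluates the threshold $g_c = 1 + n/m$, the captured fraction $d/(m + n)$ and the exponents $\beta$ and $\gamma$ as functions of $g$; the function name and return convention are hypothetical, and the formulas are the continuum-approach expressions quoted above, not a simulation.

```python
def condensation(m, n, g):
    """Long-time continuum results for a single strong vertex of fitness g.

    m : preferentially distributed edges per time step
    n : incoming edges of each new vertex
    g : relative fitness of the strong vertex (g >= 1)
    Returns (fraction, beta, gamma), where `fraction` is d / (m + n).
    """
    if m <= 0 or n <= 0 or g < 1:
        raise ValueError("need m > 0, n > 0, g >= 1")
    g_c = 1.0 + n / m                      # threshold value, Eq. (77)
    if g <= g_c:
        # Below the threshold: exponents of the homogeneous network.
        beta = m / (m + n)
        return 0.0, beta, 1.0 + 1.0 / beta
    fraction = (g - g_c) / (g_c * (g - 1.0))   # captured fraction, Eq. (80)
    beta = 1.0 / g                              # Eq. (81)
    return fraction, beta, 1.0 + g


if __name__ == "__main__":
    for g in (1.0, 1.5, 2.0, 4.0, 100.0):
        print(g, condensation(1, 1, g))
```

For $m = n = 1$ the threshold is $g_c = 2$; at $g = 4$ the strong vertex holds one third of all edges, and for very large $g$ the fraction approaches $\beta_0 = 1/2$.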
FIG. 32. "Condensation" of edges. Fraction of all edges, $d/(m + n)$, captured by a single strong vertex at long times and the scaling exponents $\beta$ and $\gamma$ vs. relative fitness $g$ of the strong vertex [175]. The network contains only one "strong" vertex. The condensation occurs above the threshold value $g_c = 1/\beta_0 = \gamma_0 - 1 > 1$. Here $\beta_0$ and $\gamma_0$ are the corresponding exponents for the network without a strong vertex. $d/(m + n) \to \beta_0$ and $\beta \to 0$ as $g \to \infty$.



FIG. 33. Schematic plot of the degree distribution of the
network with one vertex, the fitness of which exceeds the
threshold value [175]. The peak is due to edges "condensed"
on the strong vertex. A hump at the cutoff of the continuum
part of the distribution is a trace of initial conditions (see Sec.
IX D and Ref. pi|).

For $g > g_c$, in the edge condensation regime, the
strongest vertex determines the evolution of the network.
With increasing time, the gap between the in-degree of the
strongest vertex and the maximal in-degree of all others
grows (see Fig. 33). A small peak at the end of the
continuum part of the distribution is a trace of initial
conditions, see Sec. IX D. Note that the network remains
scale-free even above the threshold, i.e., for $g > g_c$,
although $\gamma$ grows with growing $g$.

The above-described initial-condition-independent
state is realized only in the limit of large networks. In
the "condensate phase", relaxation to the final state is
of a power-law kind [175]:

$$\frac{\overline{q}_{\tilde{s}}(t) - d\,t}{(m+n)\,t} \sim t^{-(g-g_c)/g} , \tag{82}$$

i.e., the fraction of all edges captured by the strong vertex
relaxes to the final value by a power law. Its exponent
$(g-g_c)/g$ approaches zero at the condensation
point $g = g_c$. This behavior evokes strong associations
with critical relaxation.

The threshold, that is, the "condensation point", can
be easily smeared in the following way. Let vertices have,
at random, two values of fitness, 1 and $g > 1$: the probability
that a vertex has fitness 1 is $1-p$, and, with
probability $p$, a vertex has fitness $g$. The characteristics
of such a network are shown in Fig. 34. These are the
fraction of all edges captured by the component of the
network consisting of "strong" vertices and the scaling
exponents of both components as functions of $g$. One
sees that the threshold is smeared, and the condensation
phenomenon is absent.

FIG. 34. Fraction of all edges, $d/(m+n)$, captured by the
component of "strong" vertices at long times and the scaling
exponents $\beta$ and $\gamma$ vs. relative fitness, $g$, of "strong" vertices
[171]. The network contains two kinds of vertices: "weak"
vertices and "strong" ones. We introduce two sets of exponents
for the two components of the network, $\beta_1$ and $\gamma_1$
for the component consisting of vertices with the unit fitness
(contains $(1-p)t$ vertices) and $\beta_g$ and $\gamma_g$ for the component
consisting of vertices with the fitness $g$ (contains $pt$ vertices).
Thin lines depict the dependences at fixed values of $p$. Arrows
show how these curves change when $p$ decreases from 1
to 0. At $p \to 0$, we obtain the dependences shown in Fig. 32 (a
single strong vertex). At $p \to 1$, $d/(m+n) \to 1$, $\beta_g \to \beta_0$,
$\beta_1 \to \beta_0/g$, $\gamma_1 \to 1 + g/\beta_0$, $\gamma_g \to \gamma_0$.

For the observation of the condensation of edges, special
distributions $P(G)$ are needed. If $P(G)$ is continuous,
it must be of a specific form in the region of the largest
fitness, $G_{max}$ [184]. The structure of the network for
the continuous distribution $P(G)$ was discussed in Sec.
IX G. We have already described the situation when the
transcendental equation (72) has a real root $c$. For some
distributions, including the considered case of a single
strong vertex with $g > g_c$, this is impossible. This indicates
the condensation phenomenon: a finite fraction
of edges condenses on a single vertex with the largest
fitness.

At this point we must make the following remark. All
the growing networks that we consider in the present section
have a general feature: each of their vertices has a
chance to get a new link. Only one circumstance prevents
their enrichment: seizure of this link by another
vertex. In such kinetics of the distribution of edges, there is
no finite radius of "interaction" and there are no obstacles
in principle to the capture of a large fraction of edges by
some vertex.



Bianconi and Barabási [184] noticed that the form of
Eq. (72) is similar to the form of the well-known equation
for the Bose gas. Edges were interpreted as Bose
particles. They are distributed among energy levels, which
are the vertices. The energy of each level was related to the
fitness $G$ of the corresponding vertex. The distribution of
these levels can be obtained from $P(G)$. One may indicate
a set of distributions $P(G)$ for which Eq. (72) has
no solution, as in the classical phenomenon
of Bose condensation. Using the analogy with this phenomenon,
Bianconi and Barabási demonstrated that, in
this "phase", a finite fraction of edges (particles of the
Bose gas) condenses on the strongest vertex (the lowest
energy level) and called this process "Bose-Einstein
condensation" [184].

One can use various parameterizations of $P(G)$, but it
is natural to study its variation mainly near $G_{max}$. E.g.,
in Ref. [184], it was shown that $P(G) \propto (G_{max} - G)^{\theta}$
produces the condensation starting from some minimal
value of the exponent, $\theta_c$.

In fact, in paper [184], the equations describing the
distribution of edges among vertices of a large network
with inhomogeneous fitness are mapped to the equations
for the Bose gas. The price of this mapping is the
introduction of thermodynamic quantities such as temperature,
etc., for the description of the network. Unfortunately,
it is not easy to find an interpretation, e.g.,
for temperature in this situation. It is "something" related
to the form of $P(G)$. Therefore, here, we prefer
to consider the "condensation" effect without applying
such analogies or introducing thermodynamic
variables, but directly using the distribution $P(G)$.



I. Correlations and distribution of edges over 
network 

In the present section, we mainly studied degree distributions.
We have to admit, however, that they provide
a rather incomplete description of a growing network.
One can better imagine the network if the average elements
of the adjacency matrix are known. Their values
are easily calculated in the continuum approximation.
In the simplest case of the citation graph, the average
number of edges $b(s,s',t)$ between vertices $s$ and
$s'$ at time $t$ ($s < s' < t$) has a very convenient feature,
$b(s,s',t>s') = b(s,s',s')$. This crucially simplifies the
calculations, and the result for scale-free citation graphs
is

$$b(s,s',t) = \frac{m}{s'}\,(1-\beta)\left(\frac{s}{s'}\right)^{-\beta} = m(1-\beta)\,s^{-\beta}s'^{\,\beta-1} , \tag{83}$$



where $m$ is the number of connections of each new vertex
(see Ref. |7§). Recall that $\beta = 1/(\gamma-1)$. This
characteristic was obtained exactly for the model of Sec.
IX C. One sees that, generally, this quantity does not factorize
into the product $\overline{k}(s,t)\overline{k}(s',t)$. The only exception
is the $\beta = 1/2$ ($\gamma = 3$) case. From Eq. (83), in the scaling
regime, we can estimate the average number of connections between
ancestor vertices of degree $k$ and descendants of degree
$k'$. In the continuum approximation, this quantity
is proportional to the probability $P(k,k')$ that vertices of
degree $k$ (ancestor) and $k'$ (descendant) are connected:

$$P(k,k') \propto k^{-1/\beta}\,k'^{-2} = k^{-(\gamma-1)}\,k'^{-2} . \tag{84}$$



The origin of the factor $k^{-(\gamma-1)}$ in Eq. (84) is clear:
new vertices are attached to old
ones with probability $\propto kP(k)$, where $P(k)$ is the degree
distribution. Meanwhile, degrees of nearest neighbors
of a vertex in equilibrium scale-free networks are
also distributed as $kP(k)$. Indeed, in equilibrium networks
with statistically uncorrelated vertices, this degree
distribution coincides with that for an end vertex (either
of the two) of a randomly chosen edge, which is
proportional to $kP(k)$ (see Sec. XI A). Then, in equilibrium
networks, the probability that a randomly chosen
edge connects vertices of degrees $k$ and $k'$ is $P(k,k') =
kP(k)\,k'P(k')/[\sum_k kP(k)]^2$, which differs sharply from Eq.
(84). The factor $k'^{-2}$ in Eq. (84) is, in particular, the
degree distribution of the nearest neighbors of the oldest
(the richest) vertex (compare with the degree distribution
of the nearest neighbors of a new vertex, $k^{-(\gamma-1)}$).
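The factorized equilibrium form quoted above is easy to tabulate for any degree distribution, which gives a reference against which correlations in a growing network can be compared. A minimal sketch follows; the function name `uncorrelated_joint` and the dictionary representation of $P(k)$ are illustrative assumptions.

```python
def uncorrelated_joint(P):
    """Joint degree distribution of the ends of a random edge
    in an uncorrelated network.

    P maps degree k -> probability P(k). The result maps (k, k') ->
    k P(k) k' P(k') / [sum_k k P(k)]^2.
    """
    mean_k = sum(k * p for k, p in P.items())
    return {
        (k, kp): k * p * kp * pp / mean_k ** 2
        for k, p in P.items()
        for kp, pp in P.items()
    }
```

By construction the result is symmetric in $k$ and $k'$ and sums to one, in contrast with the asymmetric form of Eq. (84).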

The distribution $P(k,k')$ was originally obtained by
Krapivsky and Redner [95] in the framework of the rate
equation approach [94,95,179], which is similar to the
master equation approach discussed in Sec. IX B, so
we do not present the details here. The main statement
is that this probability does not factorize. This means
that degrees of neighboring vertices in growing networks
are correlated.

If one keeps fixed the large degree $k$ of an ancestor vertex,
then the most probable linking is with a descendant
vertex of the smallest degree $k' \sim 1$. If the large degree
$k'$ of a descendant vertex is fixed, the above probability
has a maximum at some $k$ which is smaller than $k'$ but
of the order of it.

This absence of factorization (the correlations) indicates
a sharp difference between growing networks and
equilibrium graphs with statistically independent vertices.
The reason is the obvious absence of time-reversal symmetry,
that is, the quite natural asymmetry between parents and
children. Therefore, these correlations are present even
for networks growing without preferential linking.

Reliable measurements of the joint degree distribution
of neighboring vertices, $P(k,k')$, are difficult because of
poor statistics. Nevertheless, one can easily obtain
information about the correlations in a network by measuring
the dependence of the average degree of the nearest neighbors
of a vertex on its degree, $\overline{k}_{nn}(k)$ (see discussion of
the empirical data Pq| in Sec. V C 1). This is an important
characteristic of correlations in growing networks, so
let us discuss it briefly.

For a scale-free citation graph with the degree distribution
exponent $\gamma < 3$, from Eq. (83), one may easily
obtain the dependence $\overline{k}_{nn}(k) \propto k^{-(3-\gamma)}$ for small
enough degrees $k$. For larger $k$, the dependence is a slow,
logarithm-like function.
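As an illustration of how $\overline{k}_{nn}(k)$ can be measured in practice, here is a minimal sketch for an undirected simple graph given as an edge list. The function name `knn_by_degree` and the input format are assumptions made for this example only.

```python
from collections import defaultdict


def knn_by_degree(edges):
    """Average nearest-neighbour degree as a function of vertex degree.

    edges: iterable of undirected pairs (u, v) without self-loops.
    Returns a dict mapping degree k to the mean, over all vertices of
    degree k, of the average degree of their neighbours.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    deg = {v: len(ns) for v, ns in nbrs.items()}

    per_degree = defaultdict(list)
    for v, ns in nbrs.items():
        per_degree[deg[v]].append(sum(deg[w] for w in ns) / deg[v])
    return {k: sum(vals) / len(vals) for k, vals in per_degree.items()}
```

For a star with four leaves, the hub (degree 4) sees only neighbours of degree 1, while each leaf sees a neighbour of degree 4, an extreme case of a decreasing $\overline{k}_{nn}(k)$.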

If, in addition, new connections in a growing scale-free
network also emerge between old vertices, the form of
$\overline{k}_{nn}(k)$ depends on the details of the linking procedure.
For example, (a) if new connections in the old part of
the network emerge without any preference, the power-law
singularity in $\overline{k}_{nn}(k)$ is retained; (b) if these edges
connect randomly chosen old vertices with old vertices
which are chosen preferentially, the power-law singularity
is retained; but (c) if these edges connect pairs of
preferentially chosen old vertices, the singularity disappears
as the number of such connections increases.

For directed networks, it is easy to introduce the notions
of in- and out-components with respect to any vertex.
One can define the out-component as the set of all
"ancestors" of the vertex plus itself, i.e., all the vertices
that can be reached if one starts from this vertex [95].
The in-component of the vertex contains all the vertices
from which it can be reached, i.e., all its "descendants"
plus itself.
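These definitions translate into two breadth-first searches, one along the edges and one against them. The sketch below is illustrative only; the helper names `reachable` and `components` and the convention that an edge `(a, b)` means "a cites b" are assumptions of this example.

```python
from collections import deque


def reachable(adj, start):
    """Set of vertices reachable from start by following adj (includes start)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def components(edges, v):
    """Return (out_component, in_component) of vertex v.

    edges: list of directed pairs (citing, cited).
    """
    forward, backward = {}, {}
    for a, b in edges:
        forward.setdefault(a, []).append(b)
        backward.setdefault(b, []).append(a)
    return reachable(forward, v), reachable(backward, v)
```

In a citation graph, edges point from newer to older vertices, so the out-component collects ancestors and the in-component collects descendants.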

The distribution of the sizes of the in- and out-components
of citation graphs (which are, in fact, directed
networks) and their other characteristics were calculated
in Ref. [95]. These results provide information
about the topology of these networks. For
citation networks with $t$ vertices, the distribution of the
in-component sizes $s$ was found to be proportional to $t/s^2$
for $s \gg 1$. This relation is valid for a wide variety of
preference functions, including even the absence of any
preference. For such a form of the in-component size
distribution, the following condition is necessary: the power
$y$ in the preference function $k^y$ should not exceed 1.

In Ref. |)||, the out-components of scale-free citation
graphs were studied. E.g., for the BA model, that is, for
the citation graph with $\gamma = 3$, the out-component size
distribution is $\ln^{s-1}(t+1)/[(t+1)(s-1)!]$. Here, $s$ is
the out-component size. This form is valid for the network
with one edge (and, as usual, one vertex) added
per unit of time. The distribution has a maximum at
$s - 1 = \ln(t+1)$ and quickly decays at larger $s$. Hence,
the typical size of the out-component is of the order of
$\ln t$, i.e., of the order of the typical shortest-path length
in classical random graphs (see Sec. III B). Similar results
were obtained for all scale-free citation graphs [95].
The relation for the typical size of the out-component is also
valid for any citation graph with power $y$ of the preference
function $k^y$ less than or equal to 1. In this respect, these
networks are similar to classical random graphs.
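The quoted out-component size distribution is a Poisson distribution in $s-1$ with mean $\ln(t+1)$, which can be checked numerically. The sketch below is only an illustration; the function name `out_component_pmf` is hypothetical, and logarithms are used to avoid overflow in the factorial.

```python
import math


def out_component_pmf(s, t):
    """ln^{s-1}(t+1) / [(t+1) (s-1)!] for integer s >= 1 and t >= 1."""
    if s < 1 or t < 1:
        raise ValueError("requires s >= 1 and t >= 1")
    lam = math.log(t + 1)
    # lgamma(s) = ln((s-1)!), and exp(-lam) = 1/(t+1).
    return math.exp((s - 1) * math.log(lam) - math.lgamma(s) - lam)
```

Summing over $s$ gives $e^{\ln(t+1)}/(t+1) = 1$, and the mean size is $1 + \ln(t+1)$, consistent with the logarithmic typical size stated above.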



J. Accelerated growth of networks 

The linear growth, when the total number of edges
in the network is a linear function of its size (the total
number of vertices), is only a particular case of the network
evolution. For instance, data on the WWW growth
1751 (see Sec. V C 2), for the Internet §Mp7| (see
Sec. V C 1), for networks of citations in scientific literature
[100] (see Sec. V A), and for collaboration networks
[15,101] (see Sec. V B) demonstrate that the total numbers
of edges in these networks grow faster than the total
numbers of vertices, and one can say that the growth is
accelerated [186], that is, nonlinear.



One can show that a power-law dependence of the input
flow of links may produce scale-free networks [186],
and non-stationary degree distributions may emerge. In
such a case, the scaling relations of Sec. IX D are easily
generalized. In the limit of the large network size, in
general, one can write

$$P(k,t) \propto t^{z} k^{-\gamma} \tag{85}$$

and

$$\overline{k}(s,t) \propto t^{\delta}\left(\frac{s}{t}\right)^{-\beta} . \tag{86}$$

General relations in the present subsection are valid for
degree, in-degree, and out-degree distributions, so that
here $k$ denotes not only degree but also in- and out-degree.
One can show that the exponents $z$, $\delta$, and $\beta$
are coupled by the relation $z = \delta/\beta$, and the old relation
(54) is valid. The distribution for individual vertices
now is of the form

$$p(k,s,t) = \frac{s^{1/(\gamma-1)}}{t^{(1+z)/(\gamma-1)}}\, f\!\left(k\,\frac{s^{1/(\gamma-1)}}{t^{(1+z)/(\gamma-1)}}\right) . \tag{87}$$

Also,

$$P(k,t) = t^{z} k^{-\gamma} F\!\left(k\,t^{-(1+z)\beta}\right) = t^{z} k^{-\gamma} F\!\left(k\,t^{-(1+z)/(\gamma-1)}\right) , \tag{88}$$

and the distribution has a cut-off at $k_{cut} \sim t^{(1+z)/(\gamma-1)}$.
We emphasize that Eqs. (87) and (88) are quite general
relations obtained from the assumption of a power-law
dependence of the total number of edges on the total
number of vertices in the network.

As illustrative examples, in Ref. [186], two models
for accelerating growth of networks with preferential
linking were studied. In particular, the in-degree
distributions of directed networks were considered. The input
flow of edges is assumed to grow as $t^a$, where $a$ is a new
exponent, $a < 1$. The restriction is introduced to avoid
multiple edges in the network at long times.



In the first model, the additional attractiveness is constant.
In this case, $\beta = 1 + a$, $\gamma = 1 + 1/(1+a)$, $\delta = z = 0$,
and the degree distribution is stationary (if we ignore its
time-dependent cutoff), see Fig. 35, a. Therefore, the
degree distribution of a non-linearly growing scale-free
network may have the $\gamma$ exponent less than 2. When the
input flow of edges grows proportionally to $t$, $\gamma = 3/2$.


The second model is also based on the model of
Sec. IX E but, in this case, the additional attractiveness
$A$ (or the number of incoming edges $n$ of new
nodes) grows with increasing time. For instance, let
the additional attractiveness be proportional to the average
in-degree of the network with a constant factor,
$A(t) = B\overline{q}(t) = Bc_0t^a/(1+a)$. In this case, the $\gamma$
exponent exceeds 2 and the distribution is non-stationary:
$\gamma = 2 + B(1+a)/(1-Ba)$, $\beta = (1-Ba)/(1+B)$, $\delta = a$,
and $z = a(1+B)/(1-Ba)$ (see Fig. 35, b). In such an
event, the scaling regime is realized only if $Ba < 1$.
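The exponents quoted for the two models can be checked against the general relations $z = \delta/\beta$ and $\beta(\gamma - 1) = 1$ stated earlier in this subsection. The following is a small sketch with hypothetical function names; it only encodes the formulas given in the text.

```python
def first_model(a):
    """Constant additional attractiveness; input flow of edges ~ t**a."""
    return {
        "beta": 1.0 + a,
        "gamma": 1.0 + 1.0 / (1.0 + a),
        "delta": 0.0,
        "z": 0.0,
    }


def second_model(a, B):
    """Additional attractiveness proportional to the mean in-degree."""
    if B * a >= 1.0:
        raise ValueError("the scaling regime requires B*a < 1")
    return {
        "beta": (1.0 - B * a) / (1.0 + B),
        "gamma": 2.0 + B * (1.0 + a) / (1.0 - B * a),
        "delta": a,
        "z": a * (1.0 + B) / (1.0 - B * a),
    }
```

For instance, `second_model(0.5, 1.0)` gives $\beta = 1/4$, $\gamma = 5$, $\delta = 1/2$, and $z = 2$, so that $z = \delta/\beta$ and $\beta(\gamma-1) = 1$ both hold.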








FIG. 35. Schematic log-log plots of degree distributions
in the two models for accelerating growth of networks which
are discussed in Sec. IX J. The first model produces the
stationary degree distribution with the exponent $\gamma < 2$ (a) at
long times. The degree distribution of the second model (b)
is non-stationary, $\gamma > 2$. The arrows indicate changes of the
distributions as the networks grow.

The same results are valid for degree distributions of 
undirected networks which grow nonlinearly. 

In general, assuming a power-law dependence of the
input flow of edges on the network size ($a > 0$), it is easy
to obtain the following relations for the exponents $\gamma$, $z$,
and $a$ [175,186]. If one assumes that $1 < \gamma < 2$, then

$$\gamma = 1 + \frac{1+z}{1+a} , \tag{89}$$

so $z$ should be smaller than $a$. This situation is realized
in the first model above (see Fig. 35, a). At long times,
the degree distribution is stationary. One may show that
the cutoff $k_{cut} \sim t^{a+1}$, that is, of the order of the total
degree of the network, so that the cutoff is in fact absent.
If one assumes $\gamma > 2$, the relation

$$\gamma = 1 + \frac{z}{a} \tag{90}$$

is valid, so that one has to have $z > a$. The distribution
is non-stationary (see Fig. 35, b), and this is the case for
the latter model.

In both models above, the input flow was preset, that
is, we assumed its power-law time dependence. However,
such a power-law growth may arise quite naturally. Let
us consider an illustrative example. An undirected citation
graph grows according to the following rules:

(i) one new vertex is added to the network in unit time;

(ii) with a probability $1-p$ it connects to a randomly
chosen vertex or, with complementary probability $p$, not
only to this vertex but also to all its nearest neighbors.

If one is interested only in exponents, the same result
is valid if a new vertex connects to a randomly chosen
vertex plus some of its nearest neighbors, each being
chosen with the probability $p$. A new vertex actually
copies (inherits) a fraction of the connections of its ancestor
(compare with a network growth process which leads to
multifractal distributions [187,188], see Sec. X). Such
copying processes may be realized, for example, in
networks of protein-protein interactions (see discussion in
Ref. @|).

These growth rules lead to an effective linear preferential
attachment of edges to vertices with a large number
of connections. One can easily see that the average degree
(and the input flow of edges) of this graph grows
as $t^{2p-1}$ when $p > 1/2$. For $p < 1/2$, the mean degree
approaches the constant value $2/(1-2p)$ at long times,
and the degree distribution is stationary with exponent
$\gamma = 1 + 1/p > 3$. When $p > 1/2$, the degree distribution
is non-stationary, as in the latter model, and
$\gamma = 1 + 1/(1-p) > 3$.
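The statement about the mean degree follows from a one-line balance: a new vertex brings one edge plus, with probability $p$, as many edges as the degree of the randomly chosen target, whose expectation is $2E(t)/t$ for $E(t)$ edges and $t$ vertices. The sketch below iterates this expected-value recursion; the function name `mean_degree` and the initial condition (two vertices joined by one edge) are assumptions of this illustration.

```python
def mean_degree(p, t_max, t0=2, edges0=1.0):
    """Iterate E(t+1) = E(t) + 1 + p * 2 E(t) / t and return 2 E(t_max) / t_max.

    E(t) is the expected number of edges when the graph has t vertices.
    """
    edges = edges0
    for t in range(t0, t_max):
        edges += 1.0 + p * 2.0 * edges / t
    return 2.0 * edges / t_max
```

For $p = 0.1$ the result approaches $2/(1-2p) = 2.5$, whereas for $p = 0.75$ quadrupling the network size roughly doubles the mean degree, as expected for growth proportional to $t^{2p-1} = t^{1/2}$.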
A model of a directed growing network, whose mean
degree grows logarithmically with $t$, was proposed by
Vázquez [170]. The preferential linking arises in this
model dynamically. Simulations and heuristic arguments
in Ref. [170] have shown that the exponent $\gamma$ of this
network is equal or close to 2.

The above models have an input flow of edges growing
proportionally to $t^a$. The situation when it grows as $t^a$ + const
is also very interesting. Of course, at long times, such
growth yields the above distributions. However, for real
finite networks, this limit may not be approached. An
intriguing application of these ideas can be proposed for
the word web constructed in Ref. [126] (see
Sec. V D).

Recall that the empirical degree distribution of the
word web has a complex form (see Fig. ^|) with two distinct
regions. The high quality of the empirical degree
distribution, due to the large number of vertices in the word
web, $t \approx 470\,000$, and its high mean degree, $\overline{k}(t) \approx 72$,
together with the specific complex form, gives a real chance
to explain convincingly the structure of the word web.







FIG. 36. Scheme of the word web growth [127]. At each 
time step a new word appears, so t is the total number of 
words. It connects to some preferentially chosen old word. Si- 
multaneously, ct new undirected edges emerge between pairs 
of preferentially chosen old words. All the edges are undi- 
rected. We use the simplest kind of the preferential attach- 
ment when a vertex is chosen with the probability propor- 
tional to the number of its connections. 

Let us consider the minimal model for the evolving
word web [127] (see the discussion of practically the same
model for networks of collaborations in Ref. ). We use
the following rules for the growth of this undirected network
(see Fig. 36). At each time step, a new vertex
(word) is added to the network, and the total number of
vertices, $t$, plays the role of time. At its birth, the new
word connects to several old ones. We do not know the
original number of connections. We only know that it is
of the order of 1. It would be unfair to play with an unknown
parameter to fit the experimental data, so we set
this number to 1 (one can check that the introduction of
this parameter does not change the degree distribution
of the word web noticeably). We use the simplest natural
version of preferential linking, so a new word is
connected to some old one $\mu$ with the probability
proportional to its degree $k_\mu$, as in the Barabási-Albert
model [55]. In addition, at each increment of time, $ct$ new
edges emerge between old words, where $c$ is a constant
coefficient that characterizes a particular network. The
linear dependence appears if each vertex makes new
connections at a constant rate, and we choose it as the most
simple and natural. These new edges emerge between old
words $\mu$ and $\nu$ with the probability proportional to the
product of their degrees $k_\mu k_\nu$ [102,176] (see Sec. IX F).



The mean degree of the network is equal to $\overline{k}(t) =
2 + ct$. According to the preceding analysis, this yields the
stationary degree distribution with the exponent $\gamma = 3/2$
at long times. The additional constant, equal to 2, is
important if we are interested in how the stationary
distribution is approached. Simple calculations [127] in the
framework of the continuum approach yield the following
degree distribution

$$P(k,t) = \frac{1}{ct}\,\frac{cs(2+cs)}{1+2cs}\,\frac{1}{k} , \tag{91}$$

where $s = s(k,t)$ is the solution of the equation

$$k(s,t) = \left(\frac{ct}{cs}\right)^{1/2}\left(\frac{2+ct}{2+cs}\right)^{3/2} . \tag{92}$$



This distribution has two distinct regions separated by
the crossover point $k_{cross} \approx \sqrt{ct}\,(2+ct)^{3/2}$. The crossover
moves in the direction of large degrees while the network
grows. Below this point, the degree distribution is
stationary, $P(k) \cong \frac{1}{2}k^{-3/2}$ (we use the fact that in the
word web $ct \gg 1$). Above the crossover point, we obtain
$P(k,t) \cong \frac{1}{4}(ct)^3 k^{-3}$, so that the degree distribution is
non-stationary in this region.

At first sight, the contribution to the average degree of the
network (or the input flow of new links) from connections
to new words seems negligible when compared with links
which emerge between old words ($2 \ll ct \approx 70$). Nevertheless,
as we see, this small contribution produces an
observable effect.

The position of the cutoff produced by the finite-size effect
(see Sec. IX D for the relation $t\int_{k_{cut}}^{\infty} dk\,P(k) \sim 1$) is
$k_{cut} \sim \sqrt{t/8}\,(ct)^{3/2}$. In Fig. ^, we plot the degree
distribution of the model (the solid line). To obtain the
theoretical curve, the known parameters of the word web,
$t$ and $\overline{k}(t)$, were used. The deviations from the continuum
approximation are accounted for in the small-$k$ region,
$k < 10$. One sees that the agreement with the empirical
distribution is excellent. The positions of the theoretical crossover
and cutoff are also perfect. Note that no fitting was
made. For a better comparison, in Fig. ^, the theoretical
curve is displaced upward to exclude two experimental
points with the smallest $k$ since these points are dependent
on the method of the construction of the word web,
and any comparison in this region is meaningless in principle.

Note that few words are in the region above the
crossover point $k_{cross} \approx 5 \times 10^3$. As language grows,
$k_{cross}$ increases rapidly but, as follows from the above
relations, the total number of words of degree greater than
$k_{cross}$ does not change. It is a constant of the order of
$1/(8c) \approx t/(8\overline{k}) \sim 10^3$, that is, of the order of the size
of a small set of words forming the kernel lexicon of
British English, which was estimated as 5 000 words [190].
Thus, the size of the core of language does not vary as
language evolves.
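The numbers quoted for the word web can be reproduced from the two empirical inputs, $t$ and $\overline{k}(t)$, using the expressions for the crossover, the cutoff, and the size of the core given above. The following sketch is illustrative only; the function name `word_web_scales` is hypothetical, and it takes $ct = \overline{k}(t) - 2$ from the mean-degree relation.

```python
import math


def word_web_scales(t, mean_degree):
    """Return (k_cross, k_cut, n_core) for the minimal word-web model.

    t: number of words; mean_degree: empirical mean degree, 2 + c*t.
    """
    ct = mean_degree - 2.0
    k_cross = math.sqrt(ct) * (2.0 + ct) ** 1.5   # crossover degree
    k_cut = math.sqrt(t / 8.0) * ct ** 1.5        # finite-size cutoff
    n_core = t / (8.0 * ct)                       # equals 1/(8c)
    return k_cross, k_cut, n_core
```

With $t = 470\,000$ and $\overline{k} = 72$ this gives a crossover near $5 \times 10^3$ and a core of several hundred words, in line with the estimates in the text.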



K. Decaying networks 



In Sec. IX F we have described a wide spec- 
trum of possibilities to add edges to a network. 
Results for various cases may be found in Refs. 



5|pl94p^ p^ , pl9| , p75|,p79pC 



184,181 



183]. Here, we 



discuss the opposite situation, namely, a fraction of edges
may disappear during the network growth. This situation,
an additional permanent deletion of edges, is considered
in Refs. [102,176,175]. Here we write down the
result for a typical case of the model of a directed growing
network from Sec. IX F in which we, for brevity, set
$n = 0$. Each time a vertex and $m$ edges are added, $c$
old randomly chosen connections disappear. If we define
$\gamma_i(c=0) \equiv \gamma_0$, the $\gamma_i$ exponent is

$$\gamma_i = 2 + \frac{\gamma_0 - 2}{1 - \gamma_0 c/m + (c/m)^2} . \tag{93}$$



The resulting phase diagram, $c/m$ vs. $\gamma_0$, is shown in
Fig. 37. One sees that random removal of edges increases
the $\gamma_i$ exponent, which grows monotonically with
increasing $c/m$ until it becomes infinite on the line $\gamma_0 =
(c/m) + 1/(c/m)$. In the dashed region of Fig. 37, the
network is out of the class of scale-free nets. Note that,
for large enough $c/m$, the network may decay to a set of
uncoupled clusters.
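The phase boundary can be explored numerically. The sketch below assumes that Eq. (93) has the form $\gamma_i = 2 + (\gamma_0 - 2)/[1 - \gamma_0 c/m + (c/m)^2]$, which reduces to $\gamma_0$ at $c = 0$ and diverges on the line $\gamma_0 = (c/m) + 1/(c/m)$ quoted above; the function name `gamma_with_edge_deletion` is hypothetical.

```python
def gamma_with_edge_deletion(gamma0, x):
    """In-degree exponent for deletion ratio x = c/m (0 <= x < 1).

    Assumed form of Eq. (93); returns infinity at and beyond the line
    gamma0 = x + 1/x, where the network leaves the scale-free class.
    """
    if not 0.0 <= x < 1.0:
        raise ValueError("expected 0 <= c/m < 1")
    denom = 1.0 - gamma0 * x + x * x
    if denom <= 0.0:
        return float("inf")
    return 2.0 + (gamma0 - 2.0) / denom
```

For $\gamma_0 = 3$ the boundary lies at $c/m = (3 - \sqrt{5})/2 \approx 0.38$, so deleting roughly two edges for every five added is already enough to destroy the power law in this sketch.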




FIG. 37. Phase diagram of the directed network growing
under the condition of permanent random damage (the permanent
deletion of random edges) on axes $\gamma_0$, $c/m$. At each
time step, $m$ new edges are added and $c$ random edges are
deleted (see the text). $\gamma_0$ is the scaling exponent of the
corresponding network growing without deletion of edges. Curves
in the plot are lines of constant $\gamma_i$. $\gamma_i = \infty$ on the line
$\gamma_0 = (c/m) + (c/m)^{-1}$. In the dashed region, the network
is out of the class of scale-free nets.

The effect of permanent random damage (the removal
of edges) on undirected growing networks, in fact, has
been considered in Sec. IX F (see the discussion of Eq.
(I)).

The permanent random removal of vertices produces
a different effect on growing networks with preferential
linking [175]. In this case, the $\gamma$ exponent does
not change. Nevertheless, the exponent $\beta$ varies. Let
a randomly chosen vertex be deleted with probability
$c$ each time a new vertex is added to the network. If,
again, we introduce the notations $\gamma(c=0) \equiv \gamma_0$ and
$\beta(c=0) \equiv \beta_0$, then one can show that $\beta = \beta_0/(1-c)$
and $\gamma = 1 + 1/[\beta(1-c)] = 1 + 1/\beta_0 = \gamma_0$. We see
that the scaling relation (54) is violated in this situation.
The reason for this violation and for the change of $\beta$
is an effective renormalization of the $s$ variable due to
the removal of vertices. In such an event, the scaling forms of
the degree distributions for individual vertices and of the
total degree distribution are $p(k,s,t) = (s/t)^{\beta} f[k(s/t)^{\beta}]$
and $P(k) = k^{-\{1+1/[\beta(1-c)]\}} F(k/t^{\beta})$.

One can show that the permanent deletion of a fraction
of vertices with the largest values of degree, which
is an analog of intentional damage (attack) |5S |33|,[32|,
destroys the scaling behavior of the network [175].



L. Eigenvalue spectrum of the adjacency matrix 

The structure of the adjacency matrix and, therefore,
of the network itself, can be characterized by its eigenvalue
spectrum $G(\lambda)$. The eigenvalue spectra of classical
random graphs are well studied. For the undirected
infinite random graph with the Poisson degree distribution,
the (re-scaled) eigenvalue spectrum has a semi-circle
shape (here, $G(\lambda) = G(-\lambda)$) [191,192]. If such a graph
is large but finite, the tail of the distribution decreases
exponentially with growing $\lambda$.

In the recent paper [193], the eigenvalue spectrum of
the BA model, that is, of the growing scale-free network
with $\gamma = 3$, was studied numerically (see also Ref. [235]).
A sharp difference from classical random graphs was observed.
It was found that its shape is very far from a
semi-circle, and the tail of the spectrum is of a power-law
form (compare with the observations for the Internet
g, see Sec. V C 1).






M. Scale-free trees 



Above we discussed two main construction procedures
producing scale-free networks: the preferential linking
mechanism for growing networks and a rather heuristic
procedure of Molloy and Reed for equilibrium networks.
In Ref. [p7|, the (grand canonical) statistical ensemble
of tree-like graphs was constructed, that is, an equilibrium
random tree-like graph. In the particular case of
mean degree $\overline{k} = 2$, the procedure |)7J yields scale-free
trees with a $\gamma$ exponent which takes values in the range
$(2, \infty)$. Cutoffs of the degree distributions of these
equilibrium networks were found to be at the same point
$k_{cut} \sim N^{1/(\gamma-1)}$ as for growing scale-free nets (see Eq.
( p2j } in Sec. IX D). Here $N$ is the network size.



In sharp contrast to the standard logarithmic dependence
of the average shortest-path length $\overline{\ell}$ on the network
size, the constructed trees were found to have a
quite different geometry with a power-law dependence
$\overline{\ell} \propto N^{1/d_H}$. Here the fractal dimension $d_H = 2$ for $\gamma > 3$
and $d_H = (\gamma-1)/(\gamma-2)$ when $2 < \gamma < 3$.



X. NON-SCALE-FREE NETWORKS WITH 
PREFERENTIAL LINKING 



A linear form of the preference function discussed in
Sec. IX is only a very particular case. It is hard to
believe that just this case is the most wide-spread in Nature.
Moreover, recent empirical studies of collaboration
networks in the scientific literature [15,101] suggest that
the preference function may be of a power-law form (although
in Ref. [103], a linear attachment was observed in
such networks).

One can show that not only a linear form of the preference
function produces scale-free networks but also all
preference functions that have linear asymptotes in the range
of large values of degree [94,95]. Other preference functions
do not provide scale-free networks. The case of a
power-law preference function was explicitly considered
in Refs. [94,95]. The continuum approach arguments for
this situation can be found in Ref. [175].

In Refs. [94,95], the extension of the BA model to the
case of the preference function proportional to $k^y$ was
studied. The results are the following. The situations
with $0 < y < 1$ and $y > 1$ differ sharply from each
other. The case $0 < y < 1$, in fact, describes a crossover
from the BA model (linear preferential linking) to
linking without preference ($y = 0$), which produces
exponential degree distributions (see Sec. VIII). The exact
result for the stationary degree distribution [94,95] is



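$$P(k) \propto \begin{cases}
k^{-y}\exp\left[-\mu\,\dfrac{k^{1-y}-2^{1-y}}{1-y}\right] , & 1/2 < y < 1 , \\[2mm]
k^{(\mu^2-1)/2}\exp\left[-2\mu\sqrt{k}\,\right] , & y = 1/2 , \\[2mm]
k^{-y}\exp\left[-\mu\,\dfrac{k^{1-y}}{1-y} + \dfrac{\mu^2}{2}\,\dfrac{k^{1-2y}}{1-2y}\right] , & 1/3 < y < 1/2 .
\end{cases} \tag{94}$$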

Here, $\mu$ depends on $y$ and varies from 1 when $y = 0$
to 2 for $y = 1$. Near these points, $\mu(y)$ is linear:
$\mu(y) - 1 \cong 0.5078\,y$ and $2 - \mu(y) \cong 2.407(1-y)$.

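In the case of $y > 1$, most of the connections come to the oldest
vertex. Furthermore, for $y > 2$, there is a finite probability that
it is connected to all other vertices. For simplicity, let, at each
increment of time, one vertex with one edge be added. If the oldest
vertex has captured all $t$ edges so far, its degree is $t$ and each
of the other $t$ vertices has degree one. Then the probability
$\mathcal{P}(t)$ that the oldest vertex captures all edges satisfies
the relation $\mathcal{P}(t+1) = \mathcal{P}(t)\, t^y/(t^y + t)$. Then,

$$\mathcal{P}(t \to \infty) = \prod_{t=1}^{\infty} \frac{1}{1 + t^{1-y}}. \qquad (95)$$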



This probability is indeed nonzero when y > 2. 
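The convergence condition behind Eq. (95) is easy to probe numerically. The following is a minimal Python sketch, not taken from the cited papers, that truncates the infinite product at a finite cutoff. The function name `capture_probability` and the cutoff `t_max` are illustrative assumptions.

```python
def capture_probability(y: float, t_max: int = 100_000) -> float:
    """Truncated version of the infinite product in Eq. (95).

    Multiplies the factors t^y / (t^y + t) = 1 / (1 + t^(1 - y))
    for t = 1, ..., t_max (t_max is an arbitrary illustrative cutoff).
    """
    prob = 1.0
    for t in range(1, t_max + 1):
        prob /= 1.0 + t ** (1.0 - y)
    return prob


if __name__ == "__main__":
    for y in (1.5, 2.0, 2.5, 3.0):
        print(f"y = {y}: truncated product = {capture_probability(y):.4g}")
```

For $y \le 2$ the truncated product keeps shrinking as the cutoff grows, while for $y > 2$ it settles at a finite value. For $y = 3$ the limit is $\pi/\sinh\pi \approx 0.272$.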

Applying the master equation approach to this network one can obtain
the following results [94,95]. If $y > 2$, all but a finite number of
vertices are connected with the oldest vertex. For $1 < y < 2$, the
oldest vertex is connected to almost every other vertex but various
situations are possible for the distribution of edges. For
$(j+1)/j < y < j/(j-1)$, the number of vertices of degree $k > j$
(this number is equal to $tP(k > j)$) is finite, and the number of
vertices of degree $k$ grows as $t^{k-(k-1)y} < t$ for $1 < k \le j$.
This just means that $P(k = 1, t \to \infty) \to 1$, so practically
all the vertices are of unit degree and almost all the connections are
with the oldest vertex.

It would be a mistake to presume that linear preferential linking
always provides scale-free networks (or, more precisely, networks with
power-law degree distributions). We have to repeat that the growth
producing power-law distributions is only a very particular situation.
In Refs. [187,188], the idea of preferential linking f35f was combined
with partial inheritance (partial copying) of degree of individual
vertices by new ones [194,195]. (The papers [194,195] as well as Refs.
|l9^0rj|| are devoted to more complex models generically related to
the biological evolution processes.)

As an illustrative example, the directed network in which the growth
is governed by the same rule of preferential linking as in the BA
model was considered analytically. At each time step, apart from $m$
new edges being distributed preferentially, some additional
connections emerge. These additional new edges are attached to a new
vertex. This one is born with a random number of incoming edges which
is distributed according to some distribution function $P_c(q,t)$
depending on the state of the network at the birth time. In this
rather general situation, the master equation looks like

$$t\,\frac{\partial P(q,t)}{\partial t} + P(q,t) + \frac{m}{\bar{q}(t)}\left[qP(q,t) - (q-1)P(q-1,t)\right] = P_c(q,t), \qquad (96)$$



where $\bar{q}(t)$ is the average in-degree of the net at time $t$.
One sees that the term $\delta_{q,0}$ on the right-hand side of Eq.
(25) from Sec. IX B is substituted by $P_c(q,t)$ in Eq. (96), so Eq.
(96) is the direct generalization of Eq. (25). If every new vertex is
born by some randomly chosen old one and, at the moment of birth, it
"inherits" (copies), on average, a fraction $c$ of its parent's
connectivity, the distribution $P_c(q,t)$ takes a specific form which
allows us to solve the problem explicitly. More precisely, with
probability $c$, each of the $q$ incoming edges of a "parent" creates
an edge attached to its heir, so that this copying is only partial. In
particular, such a type of "inheritance" (copying) produces nodes of
zero degree which cannot get new connections. Hence, it is worth
considering only "active" nodes of non-zero degree.
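The growth rule just described can be sketched as a small simulation that tracks in-degrees only. This is an illustrative Python sketch under stated assumptions, not the analytical treatment of the cited papers: the seed configuration (a single vertex with in-degree `q0`), the order of operations within a time step, and all names are hypothetical choices.

```python
import random


def grow_with_inheritance(steps, m=1, c=0.5, q0=2, seed=0):
    """Grow a directed network and return the list of in-degrees.

    Assumed rule per time step:
      1. a parent is chosen uniformly at random; the heir copies each of
         the parent's incoming edges independently with probability c;
      2. m new edges go to old vertices chosen with probability
         proportional to their in-degree (zero-degree vertices get none);
      3. the heir joins the network with the inherited in-degree.
    """
    rng = random.Random(seed)
    in_degree = [q0]        # assumed seed: one vertex with in-degree q0
    targets = [0] * q0      # vertex id repeated once per incoming edge
    for _ in range(steps):
        parent = rng.randrange(len(in_degree))
        inherited = sum(rng.random() < c for _ in range(in_degree[parent]))
        for _ in range(m):
            v = rng.choice(targets)     # preferential choice by in-degree
            in_degree[v] += 1
            targets.append(v)
        heir = len(in_degree)
        in_degree.append(inherited)
        targets.extend([heir] * inherited)
    return in_degree


if __name__ == "__main__":
    degrees = grow_with_inheritance(20_000)
    active = [q for q in degrees if q > 0]
    print("vertices:", len(degrees), "active (q > 0):", len(active))
```

Restricting the statistics to the `active` list mirrors the remark above that vertices born with zero in-degree never acquire connections.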

The resulting network is not scale-free. One may check that its
in-degree distribution is of a multifractal type. That means that
moments of the distribution depend on network size in the following
way: $M_n(t) \sim t^{\tau(n)}$, where $\tau(n)$ is a non-linear
function of $n$. Therefore, special attention must be paid to temporal
evolution of the in-degree distribution.

If, for example, $c$ is a random number distributed homogeneously
within the interval (0, 1), and the evolution of the network starts
from the distribution $P(q, t_0 \gg 1) = \delta_{q,q_0}$, then the
in-degree distribution $P_1(q,t)$ of the vertices of non-zero degree
is of the form



Pi(t,q) = dir^ \n{d 2 q) exp[^2 In t ln(i/g 2 )] . (97) 



Here, $1 \ll q \ll q_0\sqrt{t}$, $d_1 = 0.174\ldots$, and
$d_2 = 0.840\ldots$ For $t \to \infty$, Eq. (97) takes the stationary
form



$$P_1(q) = d_1\, q^{-\sqrt{2}} \ln(d_2 q). \qquad (98)$$



One may check that, in this case, r(n) is indeed non- 
linear, r(n) = n/2 — n(n + 1), and the distribution is 
multifractal. 

One should note that the nature of the new term in Eq. (96) is rather
general, and such effects should exist in various real networks.
Unfortunately, as far as we know, no checks for multifractality of
real degree distributions have been made yet. In fact, the quality of
the existing experimental material (see Sec. V) does not allow one to
separate power-law and multifractal behaviors. It is quite possible
that what is often reported as a power-law degree distribution is in
fact a multifractal one. The situation may be similar to the one in
the field of self-organized criticality, where numerous distributions
first perceived as pure power-law dependences are now treated as
multifractal functions.

Recently, a multifractal degree distribution was obtained in a model
describing the evolution of protein-protein interaction networks
[189].



XI. PERCOLATION ON NETWORKS 



Rigorously speaking, percolation is a phenomenon defined for
structures with a well-defined metric, like, e.g., regular lattices.
In the case of networks, where it is hard to introduce metric
coordinates, one can speak about the emergence of a giant component.
In physical literature, a phenomenon related to the emergence of a
giant component in networks is usually called percolation, and the
point of the phase transition at which a giant component emerges is
called a percolation threshold [61,81,82]. We follow this tradition.

If the giant component is absent, the network is only a set of small
clusters, so that the study of this characteristic is of primary
importance. For regular lattices, to observe the percolation
phenomenon, one must remove a fraction of sites or bonds. In the case
of networks, it is not necessary to delete vertices or edges to
eliminate their giant components. For instance, one can approach the
percolation threshold by changing the degree distribution of a
network.

One should note that percolation phenomena in equilibrium and evolving
(growing) networks are of different nature, so hereafter we consider
them separately. Furthermore, the existing percolation theory for
equilibrium networks [61] and its generalizations are valid only for
specific graphs constructed by the Molloy-Reed procedure (see Sec.
IV). Also, inasmuch as giant components are discussed, we stress that
networks must be large.



A. Theory of percolation on undirected equilibrium 
networks 



Very important results on percolation on random networks with
arbitrary degree distributions and random connections are due to
Molloy and Reed [81,82]. They were subsequently developed in papers
[61,62,147,163], and the problem was brought to the level of physical
clarity. Here, we dwell on the latter efficient approach for
equilibrium networks constructed by the Molloy-Reed procedure.

The generating function (or the Z-transform) apparatus is used
extensively in modern graph theory |7(| . The Z-transform of the
degree distribution is defined as

$$\Phi(y) = \sum_{k=0}^{\infty} P(k)\, y^k, \qquad (99)$$



where $|y| \le 1$. Obviously, $\Phi(1) = 1$. For example, for the
Poisson distribution (see Eq. (H) in Sec. VI), this yields the
Z-transform $\Phi(y) = \exp[z_1(y - 1)]$, where $z_1 \equiv \bar{k}$
is the average degree of a vertex, i.e., the average number of the
nearest neighbors. The inverse Z-transform is

$$P(k) = \frac{1}{k!} \left. \frac{d^k \Phi(y)}{dy^k} \right|_{y=0} = \frac{1}{2\pi i} \oint_C dy\, \frac{\Phi(y)}{y^{k+1}}, \qquad (100)$$



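where $C$ is a contour around 0 which does not enclose singularities
of $\Phi(y)$. Moments of the distribution can be easily obtained from
the Z-transform:

$$\overline{k^n} \equiv \sum_k k^n P(k) = \left[ \left( y \frac{d}{dy} \right)^{n} \Phi(y) \right]_{y=1}. \qquad (101)$$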



In particular, the number of the nearest neighbors is
$z_1 \equiv \bar{k} = \Phi'(1)$ (naturally, $z_0 = 1$).
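The definitions (99) and (101) translate directly into a few lines of code. The sketch below, in Python, represents a degree distribution as a list `P` with `P[k]` the probability of degree `k`. The truncation at `k_max` and the function names are illustrative assumptions.

```python
import math


def poisson(z1, k_max=100):
    """Poisson degree distribution with mean z1, truncated at k_max."""
    return [math.exp(-z1) * z1 ** k / math.factorial(k)
            for k in range(k_max + 1)]


def z_transform(P, y):
    """Phi(y) = sum_k P(k) y^k, Eq. (99), for a finite list P."""
    return sum(p * y ** k for k, p in enumerate(P))


def moment(P, n):
    """n-th moment from Eq. (101).

    Since (y d/dy)^n y^k = k^n y^k, evaluating at y = 1 reduces the
    operator expression to sum_k k^n P(k).
    """
    return sum(k ** n * p for k, p in enumerate(P))


if __name__ == "__main__":
    P = poisson(3.0)
    print("Phi(0.5)     =", z_transform(P, 0.5))
    print("exp(z1(y-1)) =", math.exp(3.0 * (0.5 - 1.0)))
    print("z1 = Phi'(1) =", moment(P, 1))
```

For the Poisson case the numerical value of `z_transform(P, y)` should agree with the closed form $\exp[z_1(y-1)]$ up to the truncation error, which is negligible for `k_max` well above `z1`.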

This technique is especially convenient for the description of
branching processes and trees. Let us outline the key points of the
calculations. One can study percolation by "infecting" a random vertex
and considering the process of the infection spreading step by step.
Let this vertex belong to some connected component. At the first step,
the nearest neighbors will be infected, at the second, the second
neighbors, etc., until the step at which the whole connected component
is infected.

The first thing we should know is the following. Suppose that we
randomly choose an edge in the network. It connects two vertices. Each
of them may have some extra edges attached. How do edges breed
(multiply) at the ends of the randomly chosen edge? To know this, one
should calculate the degree distribution for an end vertex (either of
the two) of the edge. This distribution is equal to
$kP(k)/\sum_k kP(k)$. Indeed, the edge is attached to a vertex with
probability proportional to its degree $k$, and the degree
distribution of vertices is $P(k)$. The denominator ensures proper
normalization. The Z-transform of the resulting distribution is

$$\frac{\sum_k kP(k)\, y^k}{\sum_k kP(k)} = y\, \frac{\Phi'(y)}{\Phi'(1)}. \qquad (102)$$



Meanwhile, one sees that the probability that a randomly chosen edge
of such a graph connects vertices of degrees $k$ and $k'$ is

$$P(k,k') = \frac{kP(k)\, k'P(k')}{\left[\sum_k kP(k)\right]^2} \qquad (103)$$



(compare with the corresponding distribution (]8J) for 
growing networks). 

Actually, one needs a slightly different distribution than Eq. (102).
We have to know the distribution of the number of connections minus
one for either of the two end vertices of a randomly chosen edge since
we do not want to account for the original edge itself. This
probability equals $(k+1)P(k+1)/\sum_k (k+1)P(k+1)$, so one
immediately gets the corresponding Z-transform:

$$\Phi_1(y) = \frac{\Phi'(y)}{\Phi'(1)} = \frac{\Phi'(y)}{z_1}. \qquad (104)$$

Note that $\Phi_1(1) = 1$.
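The distribution behind Eq. (104) can be tabulated directly from $P(k)$. The following Python sketch is an illustration only; the function name is an assumption, and `P` is a finite list with `P[k]` the probability of degree `k`.

```python
def excess_degree_distribution(P):
    """Distribution of (degree - 1) for an end vertex of a random edge.

    Returns the list Q with Q[k] = (k + 1) P(k + 1) / z1, whose
    Z-transform is Phi_1(y) = Phi'(y) / z1 as in Eq. (104).
    """
    z1 = sum(k * p for k, p in enumerate(P))
    return [(k + 1) * P[k + 1] / z1 for k in range(len(P) - 1)]


if __name__ == "__main__":
    # Toy network: half of the vertices have degree 1, half have degree 3.
    P = [0.0, 0.5, 0.0, 0.5]
    Q = excess_degree_distribution(P)
    print(Q)        # weights of 0, 1 and 2 extra edges at an edge end
    print(sum(Q))   # normalization, i.e. Phi_1(1)
```

In the toy example an edge end is three times more likely to sit on a degree-3 vertex than on a dead end, although the two kinds of vertices are equally numerous.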



Let us start from a randomly chosen vertex and look 
how the numbers of its second-nearest neighbors are dis- 
tributed. We recall that the network is large and its ver- 
tices are statistically uncorrelated (i.e. connections are 
random), so, in particular, one can neglect connections 
between the nearest neighbors. Moreover, one should 
state that if such a network is infinitely large, then al- 
most each of its connected components has a tree-like 
structure. The results that we discuss are based on this 
key statement. Then, one can see that the Z-transform of the number of
the second neighbors of a vertex is

$$\sum_k P(k)\, [\Phi_1(y)]^k = \Phi(\Phi_1(y)). \qquad (105)$$



Indeed, the reason to write $P(k)$ in the sum is obvious: it is the
probability that the original vertex has $k$ edges. To understand the
$[\Phi_1(y)]^k$ factor, one should know the following property of the
Z-transform. If $\Psi(y)$ is the Z-transform of the distribution of
values of some quantity $X$ of a system, then $[\Psi(y)]^m$ is the
Z-transform of the distribution of the sum $\sum_{i=1}^{m} X_i$ of the
values of this quantity observed in $m$ independent realizations of
the system. At this point we use the basic assumption that vertices of
the network are statistically uncorrelated. From this and Eq. (104),
the form of Eq. (105) follows immediately. Proceeding in this way, one
gets the Z-transform of the distribution of numbers of third-nearest
neighbors, $\Phi(\Phi_1(\Phi_1(y)))$ (we have again used the tree-like
structure of the network), etc.

Eqs. (102)-(105) are the basic relations of this approach. From Eqs.
(101) and (105), one gets the average number of second-nearest
neighbors of a vertex,

$$z_2 = \Phi'(1)\,\Phi_1'(1) = \Phi''(1) = \sum_k k(k-1)P(k). \qquad (106)$$

Using the above Z-transform of the distribution of the number of
$m$-th-nearest neighbors and Eq. (101), one readily obtains

$$z_m = [\Phi_1'(1)]^{m-1}\, \Phi'(1) = \left( \frac{z_2}{z_1} \right)^{m-1} z_1. \qquad (107)$$



We see that $z_m$ is completely determined by $z_1$ and $z_2$. If the
giant connected component spans almost surely all of the network, the
typical shortest path between a pair of randomly chosen vertices may
be estimated from the condition $\sum_{m=0}^{\ell} z_m \sim N$.
Substituting Eq. (107) into this condition and assuming
$N \gg z_1, z_2$, one obtains [61]

$$\ell \cong \frac{\ln(N/z_1) + \ln[(z_2 - z_1)/z_1]}{\ln(z_2/z_1)}. \qquad (108)$$



This relation improves on the classical estimate of the typical
shortest path written in Sec. III B. If the fraction $S$ of the
network occupied by the giant connected component is less than one,
one may try to improve the estimation by replacing $N \to NS$ in Eq.
(108). Note that Eq. (108) is of a general nature. It contains only
the local characteristics of the network, the numbers of the first and
second nearest neighbors, $z_1$ and $z_2$.
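Equations (106)-(108) can be evaluated for any finite degree distribution. The Python sketch below is an illustration under the assumptions of the text (tree-like network, $z_2 > z_1$, $N \gg z_1, z_2$); the function names are hypothetical.

```python
import math


def neighbor_numbers(P):
    """Return (z1, z2): mean numbers of first and second neighbors, Eq. (106)."""
    z1 = sum(k * p for k, p in enumerate(P))
    z2 = sum(k * (k - 1) * p for k, p in enumerate(P))
    return z1, z2


def z_m(z1, z2, m):
    """Mean number of m-th nearest neighbors, Eq. (107), with z_0 = 1."""
    return 1.0 if m == 0 else (z2 / z1) ** (m - 1) * z1


def shortest_path_estimate(N, z1, z2):
    """Typical shortest-path length from Eq. (108); requires z2 > z1."""
    return (math.log(N / z1) + math.log((z2 - z1) / z1)) / math.log(z2 / z1)


if __name__ == "__main__":
    # Poisson-like example: z1 = 4 gives z2 = z1^2 = 16.
    N, z1, z2 = 10_000, 4.0, 16.0
    print("estimated shortest path:", shortest_path_estimate(N, z1, z2))
    print("z_m for m = 0..4:", [z_m(z1, z2, m) for m in range(5)])
```

With these illustrative numbers the estimate is about 6.4, i.e., the shells $z_0, z_1, \ldots$ exhaust a network of $10^4$ vertices after six to seven steps.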

One should stress that Eq. (108) is an estimate. Furthermore, an
obvious problem occurs when $\gamma < 3$, when the mean number of
second neighbors, $z_2$, diverges (see Eq. (106)) as $N \to \infty$.
Indeed, let us try to estimate $\ell$ from below in this "dangerous"
region just using Eq. (108). When $2 < \gamma < 3$, accounting for the
degree-distribution cut-off position
$k_{cut} \sim N^{1/(\gamma-1)}$ (see Sec. pCD| ), we get
$z_2 \sim \int^{k_{cut}} dk\, k^2 k^{-\gamma} \sim N^{(3-\gamma)/(\gamma-1)}$.
Substituting this relation into Eq. (108) gives the finite value
$\ell \approx (\gamma-1)/(3-\gamma) + 1 = 2/(3-\gamma)$ in the large
network limit (note that another estimate was recently made in Ref.
[201]). The last equation demonstrates that the above approach (the
tree ansatz) fails when $\gamma < 3$ but may also be considered as a
very rough estimate from below for $\ell$.

Let us consider the distribution of the sizes of connected components
of the networks. It is convenient to introduce the distribution of the
sizes of components which are reachable if we start from a randomly
chosen edge and move through one of its ends. Let its Z-transform be
$H_1(y)$. In Ref. [61], the following important equation was obtained
for it,

$$H_1(y) = y\,\Phi_1(H_1(y)). \qquad (109)$$



The tree-like structure of the large networks under consideration was
again used. In this situation, the probability to reach some connected
component moving in such a way is equal to the sum of probabilities
(i) that there is only a single vertex, i.e., the dead end, (ii) that
this vertex has one extra edge leading to another component, (iii)
that it has two extra edges leading to two other components, and so
on. Accounting for this structure and for the already used property of
powers of the Z-transform, one gets Eq. (109) (compare it with the
basic Eq. (fo|)). Now one can easily write the expression for the
Z-transform of the distribution of sizes of connected components, that
is, the components reachable starting from a randomly chosen vertex.
Practically repeating the derivation of Eq. (105), one obtains [61]

$$H(y) = y\,\Phi(H_1(y)). \qquad (110)$$



The factor $y$ appears here, since the starting vertex also belongs to
the connected component.

From Eqs. (109) and (110), using Eqs. (99) and (100), we can find the
distribution that we discuss. It is easy to find the average connected
component size $s$. From Eq. (110), it follows that

$$s = H'(1) = 1 + \Phi'(1)\, H_1'(1). \qquad (111)$$

$H_1'(1)$ can be obtained from Eq. (109),

$$H_1'(1) = 1 + \Phi_1'(1)\, H_1'(1), \qquad (112)$$

so

$$s = 1 + \frac{\Phi'(1)}{1 - \Phi_1'(1)}. \qquad (113)$$



From this, one sees that the giant connected component exists when

$$\Phi_1'(1) > 1, \qquad (114)$$

that is, when

$$\Phi''(1) - \Phi'(1) > 0, \qquad (115)$$

or, equivalently, when

$$\sum_k k(k-2)P(k) \equiv \overline{k^2} - 2\bar{k} > 0. \qquad (116)$$



The average size of connected components turns out to be infinite and
the giant connected component emerges when $\sum_k k(k-2)P(k) = 0$.
Accounting for expression (106) for $z_2$, we see that the giant
connected component is present when the average number of second
nearest neighbors is greater than the average number of nearest
neighbors,

$$z_2 > z_1. \qquad (117)$$



This strong result is due to Molloy and Reed [81,82] and is derived
above following Refs. [61,163] (heuristic arguments leading to Eq.
(116) may be found in Ref. [81]). It has several important
consequences. In particular, from Eq. (116), it follows that the giant
connected component is present when
$\sum_{k \ge 3} k(k-2)P(k) > P(1)$ (the case $P(2) = 1$ is special
since just in this situation the network has no tree-like structure).
Isolated vertices do not influence the existence of the giant
connected component. Then, if $P(1) = 0$, the giant component exists
when $\sum_{k \ge 3} k(k-2)P(k) > 0$. We see that dead ends are of
primary importance for the existence of the giant component. Indeed,
only the term with $P(1)$ in Eq. (116) prevents the giant connected
component.
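The criterion (116) amounts to a one-line test on the degree distribution. A minimal Python sketch follows; the function names and the two toy distributions are illustrative assumptions.

```python
def molloy_reed_sum(P):
    """Left-hand side of Eq. (116): sum_k k (k - 2) P(k)."""
    return sum(k * (k - 2) * p for k, p in enumerate(P))


def has_giant_component(P):
    """True when Eq. (116) holds, i.e. when z2 > z1 as in Eq. (117)."""
    return molloy_reed_sum(P) > 0


if __name__ == "__main__":
    few_dead_ends = [0.0, 0.5, 0.0, 0.5]    # degrees 1 and 3, equal weight
    many_dead_ends = [0.0, 0.9, 0.0, 0.1]   # 90% of vertices are dead ends
    print(has_giant_component(few_dead_ends))
    print(has_giant_component(many_dead_ends))
```

Only the $k = 1$ term enters the sum with a negative sign, which is the numerical counterpart of the statement that dead ends alone can prevent the giant component.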

If the giant component exists, the resulting relations are the same,
Eqs. (109) and (110), but now $H(y)$ corresponds to the distribution
of the sizes of connected components of the network excluding the
giant component, so $H(1) = 1 - W$, where $W$ is the relative size of
the giant connected component. Then, from Eqs. (109) and (110), one
sees [61] that

$$1 - W = \Phi(t_c), \qquad (118)$$

where $t_c$ is the smallest real non-negative solution of

$$t_c = \Phi_1(t_c). \qquad (119)$$
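Equations (118) and (119) are straightforward to solve numerically. The Python sketch below uses plain fixed-point iteration started from $t = 0$; because $\Phi_1$ is non-decreasing on $[0,1]$, the iterates increase monotonically towards the smallest non-negative solution. The function name, tolerance, and iteration cap are illustrative assumptions.

```python
def giant_component_size(P, tol=1e-14, max_iter=100_000):
    """Return (W, t_c) from Eqs. (118) and (119) for a finite list P."""
    z1 = sum(k * p for k, p in enumerate(P))

    def phi(y):
        return sum(p * y ** k for k, p in enumerate(P))

    def phi1(y):
        # Phi_1(y) = Phi'(y) / z1, Eq. (104)
        return sum(k * p * y ** (k - 1) for k, p in enumerate(P) if k > 0) / z1

    t = 0.0
    for _ in range(max_iter):
        t_next = phi1(t)
        if abs(t_next - t) < tol:
            t = t_next
            break
        t = t_next
    return 1.0 - phi(t), t


if __name__ == "__main__":
    # Degrees 1 and 3 with equal weight: Phi_1(t) = 1/4 + 3 t^2 / 4,
    # whose smallest fixed point is t_c = 1/3, giving W = 22/27.
    W, t_c = giant_component_size([0.0, 0.5, 0.0, 0.5])
    print("t_c =", t_c, " W =", W)
```

Below the threshold the iteration converges to $t_c = 1$ and the function returns $W = 0$; close to the threshold the convergence becomes slow, so a root finder may be preferable there.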



Notice that the effect of isolated vertices is trivial. They produce
the natural addendum $P(0)$ on the right-hand side of Eq. (118), and
this is all. Also, the isolated vertices do not influence the
existence of the giant component. Therefore, we can exclude these
nodes from consideration and set $P(0) = 0$. Then, from Eqs. (118) and
(119), we immediately see that the absence of dead ends, i.e.,
$P(1) = 0$, is sufficient for $W = 1$. (Note again that it should be
$P(2) < 1$.) Indeed, in this case, $\Phi_1(0) \propto \Phi'(0) = 0$,
so $t_c = 0$ and $W = 1$. If $P(1) > 0$, usually $W < 1$. It seems
there exists a situation [61] in which $W = 1$ even if $P(1) > 0$. Let
the degree distribution be of a power-law form,
$P(k) \propto k^{-\gamma}$, $k \ge 1$, with exponent $\gamma \le 2$.
Then its first moment
$\bar{k} \equiv z_1 = \Phi'(1) = \sum_{k=1}^{\infty} k\, k^{-\gamma}$
diverges. This means that $\Phi_1(y < 1) = 0$, and Eq. (119) has the
solution $t_c = 0$. Therefore, this case provides $W = 1$. We should
warn, however, that, formally speaking, the tree ansatz, which is the
basis for the above conclusions, is not applicable in this situation.
Notice that degree distributions with $\gamma \le 2$ lead to the
divergence of the first moment, so the number of edges in such nets
grows faster than the number of vertices as the network size increases
(see Sec. IX J).



From Eqs. (118) and (119), the size of the giant connected component
of the classical random graph of Erdős and Rényi can be easily found.
We have seen that its Poisson degree distribution produces
$\Phi(y) = \Phi_1(y) = e^{z_1(y-1)}$. Then $1 - W = e^{-z_1 W}$. Hence
the giant connected component of such a network exists if its average
degree exceeds one.
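The transcendental relation $1 - W = e^{-z_1 W}$ has no elementary closed-form solution, but it is easily solved by iteration. The Python sketch below starts from $W = 1$ so that the iterates decrease towards the largest solution, which is the physically relevant one; the function name and tolerances are illustrative assumptions.

```python
import math


def erdos_renyi_giant_component(z1, tol=1e-12, max_iter=1_000_000):
    """Solve 1 - W = exp(-z1 * W) for the relative giant-component size W."""
    W = 1.0
    for _ in range(max_iter):
        W_next = 1.0 - math.exp(-z1 * W)
        if abs(W_next - W) < tol:
            return W_next
        W = W_next
    return W


if __name__ == "__main__":
    for z1 in (0.5, 1.5, 2.0, 4.0):
        print(f"z1 = {z1}: W = {erdos_renyi_giant_component(z1):.4f}")
```

For $z_1 < 1$ the iteration collapses to $W = 0$, and for $z_1 = 2$ it gives $W \approx 0.80$, in line with the threshold at average degree one.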



From Eqs. (109) and (110), one can understand the analytical
properties of $H(y)$ and $H_1(y)$ near the percolation threshold. The
analytical structure of $H(y)$, in turn, determines the asymptotic
form of the size distribution of connected components. Substituting
the inverse of $r = H_1(y)$, i.e., $y = H_1^{-1}(r)$, into Eq. (109),
we get

$$y = r/\Phi_1(r). \qquad (120)$$



The derivative $dy(r)/dr$ is zero at the point of singularity of
$H_1(y)$, $y^*$. This one is, as one can find from Eq. (120), at the
point $r^*$ which is determined from the equation

$$\Phi_1(r^*) - r^*\,\Phi_1'(r^*) = 0. \qquad (121)$$

At the percolation threshold, where the giant connected component
emerges, $\Phi_1'(1) = 1$. Accounting, in addition, for the equality
$\Phi_1(1) = 1$, one sees that, at the percolation threshold,
$r^* = 1$. Then, it follows from Eq. (120) that, in this situation,
the singularity of $H_1(y)$ reaches $y^* = 1$.

At the percolation threshold, it is easy to expand $y(r)$ about
$r^* = 1$ using Eq. (120). This gives
$y[1 + (r-1)] \cong 1 + 0\cdot(r-1) - \Phi_1''(1)(r-1)^2/2$. If
$\Phi_1''(1) \neq 0$, which fails only for very special distributions,
one gets
$r = H_1(y) \cong 1 - [2/\Phi_1''(1)]^{1/2}(1-y)^{1/2}$ near
$y^* = 1$.

Equation (110) shows that this singularity in $H_1(y)$ coincides with
the one in $H(y)$ since $\Phi(y)$ has no singularities for $y < 1$.
Then, at the percolation threshold, near $y^* = 1$, $H(y)$ looks as

$$H(y) \cong C_1 + C_2 (1-y)^{1/2}, \qquad (122)$$

where $C_1$ and $C_2$ are constants. Knowing the analytical structure
of $H(y)$ at the percolation threshold, and using the properties of
the Z-transform (recall that a power-law function $w^{-\alpha}$ as
$w \to \infty$ yields the Z-transform of the form
$C_3 + C_4 (1-y)^{\alpha-1}$ + analytical terms near $y = 1$, where
$C_3$ and $C_4$ are constants), one can restore the structure of the
distribution $\mathcal{P}_s(w)$ of sizes of the connected components
in the network near this point. It looks like [61]

$$\mathcal{P}_s(w) \sim w^{-3/2} e^{-w/w^*}, \qquad (123)$$

where $w^* = 1/\ln|y^*|$, and $y^*$ is the point of the singularity in
$H(y)$ closest to the origin, that is, just the singularity that we
discussed above. Near the percolation threshold, $y^*$ is close to 1.
The values of $w^*$ and $y^*$ depend on the particular form of the
degree distribution. The exponent 3/2 is the same for all reasonable
degree distributions. This value is quite natural. Indeed, at the
threshold point the average size of connected components diverges, so
this exponent cannot be greater than 2. We emphasize that Eq. (123) is
valid only near the percolation threshold.

The power-law form $\mathcal{P}_s(w) \sim w^{-3/2}$ of the size
distribution for connected components at the percolation threshold
point corresponds to the form $\mathcal{P}(w) \sim w^{-5/2}$ for the
probability that a randomly chosen vertex belongs to a finite
connected component of size $w$. The latter probability is a basic
quantity in percolation theory. From this form, using standard
arguments of percolation theory, one finds that if the giant component
is absent, the largest connected component has a size of the order of
$N^{2/3}$ (near the percolation threshold). Here $N$ is the size of
the network. In this situation, the size of the second largest
connected component is of the order of $\ln N$. Also, one can prove
that, if the giant connected component exists and $\gamma > 2$, the
sizes of all other connected components are of the order of $\ln N$ or
less |32j .

In principle, the outlined theory allows one to compute the main
statistical properties of these equilibrium networks. Analytical
calculations are possible only for the simplest degree distributions,
but numerics is easily applicable [61]. The results may also be
checked by simulation using, e.g., an efficient algorithm for
percolation problems.

In the same paper [61], one can find another generalization of this
theory to the case of undirected bipartite graphs (see Fig. || in Sec.
V B). The bipartite graphs in which connections are present only
between vertices of different kinds were considered. Such networks, in
particular, describe collaborations (see Sec. V B). The proposed
theory describes percolation on these networks. In addition, it allows
one to calculate the degree distribution of the one-mode projection of
the bipartite graph from two degree distributions of its vertices of
different kinds. These relations were checked using data on real






collaboration graphs: the Fortune 1000, movie actors, collaborations
in physics, etc. For example, from the data for the network of the
members of the boards of directors of the Fortune 1000 companies, the
distribution of the numbers of boards on which a director sits and the
distribution of the numbers of directors on boards were extracted.
From these two distributions, the distribution of the total numbers of
co-directors for each director was calculated. The result turned out
to be very close to the corresponding empirical distribution of the
co-directors.



B. Percolation on directed equilibrium networks 

This theory can be generalized to equilibrium directed networks with
statistically uncorrelated vertices |n , 112 (see Fig. ^ and Sec.
V C 2, where the definitions of a giant strongly connected component
(GSCC), a giant weakly connected component (GWCC), and giant in- and
out-components (GIN and GOUT) were given). In this case, it is natural
to consider the joint in- and out-degree distribution $P(k_i,k_o)$ and
its Z-transform

$$\Phi(x,y) = \sum_{k_i,k_o} P(k_i,k_o)\, x^{k_i} y^{k_o}. \qquad (124)$$

If all the connections are inside of the network the average in- and
out-degrees are equal (see Eq. (||) in Sec.),

$$\partial_x \Phi(x,y)\big|_{x=1,y=1} = \partial_y \Phi(x,y)\big|_{x=1,y=1} \equiv z^{(d)} = z_1/2,$$

where $z_1$ is the mean degree of the network. The Z-transform of the
degree distribution is $\Phi^{(w)}(x) \equiv \Phi(x,x)$. Using Eqs.
(118) and (119), from $\Phi^{(w)}(x)$, one gets the relative size $W$
of the GWCC.

The sizes of the GIN and GOUT can be obtained in the framework of the
following rigorous procedure [61]. One introduces the Z-transform of
the out-degree distribution of the vertex approachable by following a
randomly chosen edge when one moves along the edge direction,
$\Phi_1^{(o)}(y) \equiv \partial_x \Phi(x,y)|_{x=1}/z^{(d)}$. Also,
$\Phi_1^{(i)}(x) \equiv \partial_y \Phi(x,y)|_{y=1}/z^{(d)}$
corresponds to the in-degree distribution of the vertex which one can
approach moving against the edge direction. The GIN and GOUT are
present if
$\Phi_1^{(i)\prime}(1) = \Phi_1^{(o)\prime}(1) = \partial^2_{xy}\Phi(x,y)|_{x=1,y=1}/z^{(d)} > 1$,
i.e., when

$$\sum_{k_i,k_o} (2k_ik_o - k_i - k_o)\,P(k_i,k_o) = 2\sum_{k_i,k_o} k_i(k_o - 1)\,P(k_i,k_o) = 2\sum_{k_i,k_o} k_o(k_i - 1)\,P(k_i,k_o) > 0, \qquad (125)$$

which is the generalization of Eq. (116). This is also the condition
for the existence of the GSCC. In this case, there exist non-trivial
solutions of the equations

$$x_c = \Phi_1^{(i)}(x_c), \qquad y_c = \Phi_1^{(o)}(y_c). \qquad (126)$$

They have the following meaning. $x_c < 1$ is the probability that the
connected component, obtained by moving against the edge directions
starting from a randomly chosen edge, is finite. $y_c < 1$ is the
probability that the connected component, obtained by moving along the
edge directions starting from a randomly chosen edge, is finite. The
expressions for the relative sizes of the GIN, $I$, and GOUT, $O$,
have the form loll]

$$I = 1 - \Phi(x_c, 1), \qquad O = 1 - \Phi(1, y_c). \qquad (127)$$

Recall that, in our definition, the GSCC is the intersection of GIN
and GOUT. Accounting for the meaning of $x_c$ and $y_c$, we can find
exactly the relative size of the GSCC (see the derivation in Ref.
JTT|]):

$$S = \sum_{k_i,k_o} P(k_i,k_o)\,(1 - x_c^{k_i})(1 - y_c^{k_o}) = 1 - \Phi(x_c,1) - \Phi(1,y_c) + \Phi(x_c,y_c). \qquad (128)$$

Furthermore, knowing $W$, $S$, $I$, and $O$, it is easy to obtain the
relative size of tendrils,

$$T = W + S - I - O. \qquad (129)$$

Equations (118), (119), and (126)-(129) allow us to obtain exactly the
relative sizes of all the giant components of equilibrium directed
networks with arbitrary joint in- and out-degree distributions, if
their vertices are statistically uncorrelated. It is useful to rewrite
Eq. (128) in the form

$$S = IO + \Phi(x_c,y_c) - \Phi(x_c,1)\,\Phi(1,y_c). \qquad (130)$$

If the joint distribution of in- and out-degrees factorizes,
$P(k_i,k_o) = P_i(k_i)P_o(k_o)$, Eq. (130) takes the simple form
$S = IO$. Here, $P_i(k_i)$ is the in-degree distribution of the
network, and $P_o(k_o)$ is the out-degree distribution. Otherwise,
such factorization of $S$ is impossible. At the point of the emergence
of GIN, GOUT, and GSCC, $x_c = y_c = 1$, and $I$, $O$, and $S$
simultaneously approach zero.

FIG. 38. Schematic plots of the variations of all the giant components
vs. some control parameter for the undirected equilibrium network (a)
and for the directed equilibrium one (b). In the undirected graph, the
meanings of the giant connected component (GCC), i.e., its percolating
cluster, and the GWCC coincide. Near the point of the emergence of the
GWCC, its size is a linear function of the control parameter. Near the
point of the emergence of the GSCC, GIN, and GOUT, the sizes of the
giant in- and out-components vary linearly with the control parameter,
and the size of the giant strongly connected component is a quadratic
function of the deviation of the control parameter from this point.
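The relations for directed equilibrium networks, from the Z-transform of the joint degree distribution to the sizes $I$, $O$, and $S$, can be evaluated numerically for any finite joint distribution. The Python sketch below is an illustration only: the dictionary representation of $P(k_i,k_o)$, the function names, and the fixed-point iteration started from zero are assumptions, and the labels `I` and `O` simply follow the convention of the formulas in this section.

```python
def directed_giant_components(P, tol=1e-14, max_iter=100_000):
    """Relative sizes of GIN (I), GOUT (O) and GSCC (S).

    P maps pairs (k_in, k_out) to probabilities; the mean in- and
    out-degrees are assumed equal, as they must be in a closed network.
    """
    z_d = sum(ki * p for (ki, ko), p in P.items())

    def phi(x, y):
        return sum(p * x ** ki * y ** ko for (ki, ko), p in P.items())

    def phi1_in(x):    # d/dy Phi(x, y) at y = 1, divided by z_d
        return sum(ko * p * x ** ki for (ki, ko), p in P.items()) / z_d

    def phi1_out(y):   # d/dx Phi(x, y) at x = 1, divided by z_d
        return sum(ki * p * y ** ko for (ki, ko), p in P.items()) / z_d

    def smallest_fixed_point(f):
        t = 0.0
        for _ in range(max_iter):
            t_next = f(t)
            if abs(t_next - t) < tol:
                return t_next
            t = t_next
        return t

    x_c = smallest_fixed_point(phi1_in)
    y_c = smallest_fixed_point(phi1_out)
    size_in = 1.0 - phi(x_c, 1.0)
    size_out = 1.0 - phi(1.0, y_c)
    size_scc = 1.0 - phi(x_c, 1.0) - phi(1.0, y_c) + phi(x_c, y_c)
    return {"I": size_in, "O": size_out, "S": size_scc,
            "x_c": x_c, "y_c": y_c}


if __name__ == "__main__":
    # Factorized toy distribution: in- and out-degrees independent.
    marginal = {0: 0.25, 2: 0.5, 4: 0.25}
    P = {(ki, ko): pi * po
         for ki, pi in marginal.items() for ko, po in marginal.items()}
    r = directed_giant_components(P)
    print(r)
    print("S - I*O =", r["S"] - r["I"] * r["O"])
```

For the factorized toy distribution the printed difference should vanish up to rounding, reproducing $S = IO$; a joint distribution with correlated in- and out-degrees would give a nonzero difference.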

Variations of the giant components of equilibrium directed networks
with some control parameter were studied in Ref. [112] (see Fig. 38).
$I$ and $O$ approach the point of the emergence of the GSCC in a
linear fashion, and $S$ is a quadratic function of the deviation of
the control parameter from its critical value.

Not pretending to apply this theory for equilibrium networks to the
WWW, which is certainly non-equilibrium (growing), we recall the
relative sizes of the giant components of the WWW. From the data of
Ref. @ (see Sec. V C 2), in the WWW, $I \approx O \approx 0.490$, so
$IO \approx 0.240$, that is, less than the value $S \approx 0.277$
measured in Ref. || but not far from it.

An attempt was made to model the WWW using the measured in- and
out-degree distributions and estimate the sizes of these components
[61], but the result turned out to be far from reality. The main point
of the discrepancy was the following. The reasonable values of
parameters of the model network used in the calculation (in
particular, the reasonable fraction of vertices with zero out-degree)
produce a huge difference between the sizes of the giant in- and
out-components, unlike the nearly equal in- and out-components of the
WWW (see Sec. V C 2). The authors of the paper [61] ascribed this
discrepancy to the approximation
$P(k_i,k_o) \approx P_i(k_i)P_o(k_o)$ which they used in their
calculations, so the correlations between in- and out-degree of
vertices discussed in Sec. IX F were not accounted for. We may add
that the equilibrium nets with statistically uncorrelated vertices,
which we consider in this section, are far from the WWW, whose growth
produces strong correlations between its vertices.



C. Failures and attacks 

The effect of random damage and attack on communications networks
(WWW and Internet) was simulated by Albert, Jeong, and Barabási [59].
In their simulations, they used:

(i) the real sample of the WWW containing 325 729 vertices and
1 498 353 links,

(ii) the existing map of the Internet containing 6 209 vertices and
24 401 links,

(iii) the model for a scale-free network with the $\gamma$ exponent
equal to 3 (the BA model), and

(iv) for comparison, the exponential growing network (see Sec. VIII).

One should again stress that all these growing networks differ from
the equilibrium networks considered in Secs. XI A and XI B. Edges
between their vertices are distributed in a different manner because
of their growth (see Sec. IX 1). Recall that the networks (i) and (ii)
have the $\gamma$ exponents in the range between 2 and 3.

Failures (random damage) were modeled by the instant removal of a
fraction of randomly chosen vertices. The intentional damage (attack)
was described by the instant deletion of a fraction of vertices with
the highest numbers of connections (degree). The networks were grown
and then were instantly damaged. In these simulations, the networks
were treated as undirected. The following quantities were measured as
functions of the fraction $f$ of deleted vertices:

(i) the average shortest path $\ell$ between randomly chosen vertices
of the network,

(ii) the relative size $S$ of the largest connected component
(corresponds to the giant connected component if it exists), and

(iii) the average size $s$ of connected components (excluding the
giant connected one).
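The damage protocol described above can be reproduced on a small model network. The Python sketch below is not the code of Ref. [59]; it is a minimal illustration that grows a BA-like undirected graph, removes a fraction $f$ of vertices either at random or in decreasing order of their initial degree, and measures the relative size of the largest remaining connected component. All names, the seed graph, and the parameter values are assumptions.

```python
import random
from collections import deque


def barabasi_albert(n, m, rng):
    """Grow an undirected BA-like graph; returns a list of adjacency sets."""
    adj = [set() for _ in range(n)]
    stubs = []                       # each vertex appears once per edge end
    for u in range(m + 1):           # assumed seed: complete graph on m + 1
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            stubs += [u, v]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:       # m distinct, degree-proportional targets
            chosen.add(rng.choice(stubs))
        for v in chosen:
            adj[new].add(v)
            adj[v].add(new)
            stubs += [new, v]
    return adj


def largest_component_fraction(adj, removed):
    """Largest connected component after removal, relative to original size."""
    seen = set(removed)
    best = 0
    for start in range(len(adj)):
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                 # breadth-first search over survivors
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)


def damage(adj, f, mode, rng):
    """Remove a fraction f of vertices: mode is 'failure' or 'attack'."""
    count = int(f * len(adj))
    if mode == "attack":             # highest initial degrees first
        order = sorted(range(len(adj)), key=lambda v: len(adj[v]), reverse=True)
        removed = order[:count]
    else:                            # uniformly random vertices
        removed = rng.sample(range(len(adj)), count)
    return largest_component_fraction(adj, removed)


def demo(n=2000, m=2, f=0.3, seed=0):
    rng = random.Random(seed)
    adj = barabasi_albert(n, m, rng)
    return damage(adj, f, "failure", rng), damage(adj, f, "attack", rng)


if __name__ == "__main__":
    s_failure, s_attack = demo()
    print("largest component after failures:", s_failure)
    print("largest component after attack:  ", s_attack)
```

In this sketch the attack ranks vertices by their degree in the undamaged network and does not recompute degrees during removal, which is one of several possible conventions.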

A striking difference between the scale-free networks and the
exponential one was observed. Whereas the exponential network produces
the same dependences $\ell(f)$, $S(f)$, and $s(f)$ for both kinds of
damage, for all scale-free nets which are discussed here, these curves
are distinct for different types of damage. The qualitative effect of
the intentional damage was more or less the same for all four networks
(see Fig. 39). The average shortest path rapidly grows with growing
$f$, and the size of the giant connected component turns to zero at
some point $f_c$ indicating the percolation threshold, $S(f_c) = 0$.
Near this point, $S(f) \propto (f_c - f)$ as in the mean-field theory
of percolation. At $f_c$, $s$ diverges. Hence, the behavior is usual
for the mean-field (or infinite-dimensional) percolation and for the
percolation in the classical random graphs. Nevertheless, one general
distinct feature of these scale-free nets should be emphasized. The
value $f_c$ in them is anomalously low, several percent, unlike the
percolation threshold of the exponential networks, so that such nets
are very sensitive to intentional damage.

FIG. 39. Schematic plots of the effect of intentional and random
damage (attack and failures) on the characteristics of exponential
undirected networks and scale-free undirected ones with exponent
$\gamma \le 3$ [59]. The average shortest path between vertices,
$\ell$, the size of the largest connected component, $S$, and the
average size of isolated clusters, $s$, are plotted vs. the fraction
of removed vertices $f \equiv 1 - p$. The networks are large. The
solid lines show the effect of the random damage. The effect of the
intentional damage is shown by the dashed lines. For the exponential
networks, both kinds of damage produce the same dependences. For the
scale-free networks with $\gamma \le 3$, in the event of the random
damage, the percolation threshold is at the point $f \to 1$.



The main observation of Ref. [59] is that the random damage has a far less pronounced effect on the scale-free nets than the intentional one. The variations of the average shortest distance with $f$ are hardly visible. The size of the giant strongly connected component decreases slowly until it disappears in the vicinity of $f = 1$. $\overline{s}(f)$ grows smoothly with growing $f$ without visible signs of singularity. This means that these scale-free networks are extremely resilient to random damage. To destroy them acting in such a way, that is, to eliminate their giant connected component and to disintegrate them to a set of uncoupled clusters, it is necessary to delete practically all their vertices!

Similar observations were made for scale-free networks 
of metabolic reactions fil| , protein networks Q] and 
food webs Q . 

The effect of the attack on the scale-free networks 
seems rather natural since vertices of the highest degree 
determine the structure of these nets, but the vitally im- 
portant resilience against failures needs detailed expla- 
nation. Several recent papers have been devoted to the 
study of this intriguing problem. 



D. Resilience against random breakdowns 



As we saw in Sec. XI C, the random breakdowns (failures) of networks more or less correspond to the classical site percolation problem, i.e., a vertex of the network is present with probability $p = 1 - f$, and one has to study how properties of the network vary with changing $p$. Now we have at hand the controlling parameter $p$ to approach the percolation threshold. This parameter can be easily inserted into the general relations of Sec. XI A.

The first calculations of the threshold for failures in scale-free networks belong to Cohen, Erez, ben-Avraham, and Havlin [60]. Here, for logical presentation, we start with the considerations based on the approach of Callaway, Newman, Strogatz, and Watts [62].

Let us look at how Eqs. (109) and (110) (for undirected equilibrium networks) are modified when each vertex of a network is present with probability $p$ (site percolation). Let $P(k)$ be the degree distribution for the original (i.e., undamaged or virgin) network with $p = 1$. Again, to find $H_1(y)$, we have to start from a randomly chosen edge, but now we start from a randomly chosen edge of the original network with $p = 1$, so it may be absent when $p < 1$. Hence we must account for the probability $1 - p$ that the very first vertex (the vertex at the end of the edge) is absent. This produces the term $(1 - p)y^0$ in $H_1(y)$. Then, we can pass through this point with the probability $p$, so the equation for $H_1(y)$ takes the form

$$H_1(y) = 1 - p + p\,y\,\Phi_1(H_1(y)). \tag{131}$$

Similarly, while calculating $H(y)$, we start from a randomly chosen vertex of the original network with $p = 1$, which is absent in the network under consideration with the probability $1 - p$. Hence we obtain the additional term $1 - p$ in $H(y)$ and must multiply the remaining contribution (see Eq. (110)) by $p$. Therefore,

$$H(y) = 1 - p + p\,y\,\Phi(H_1(y)). \tag{132}$$



This pair of equations [62] actually solves the site percolation problem for networks with random connections. In the event of bond percolation, i.e., when an edge is present with the probability $p$, the form of Eq. (131) does not change (just follow the justification of Eq. (131) above). Nevertheless, in this case, the equation for $H(y)$ differs from Eq. (132). Indeed, in the bond percolation problem, all vertices are present, so when we start from a randomly chosen vertex, we can repeat the arguments leading to Eq. (110) of Sec. XI A. Then, again, $H(y) = y\,\Phi(H_1(y))$.

For site percolation, proceeding in the way outlined in Sec. XI A, one gets from Eqs. (131) and (132) the following expression for the average size of the connected component above the percolation threshold:

$$\overline{s} = H'(1) = p\left[1 + \frac{p\,\Phi'(1)}{1 - p\,\Phi_1'(1)}\right]. \tag{133}$$

For bond percolation, it looks like

$$\overline{s} = 1 + \frac{p\,\Phi'(1)}{1 - p\,\Phi_1'(1)}. \tag{134}$$



Therefore, the criterion for the existence of the giant connected component now becomes $\Phi_1'(1) > 1/p$, i.e., $p\,\Phi''(1) - \Phi'(1) > 0$, for both (!) site and bond percolation. Now, instead of the criterion of Molloy and Reed one has [?]

$$\sum_{k=0}^{\infty} k\left(k - 1 - \frac{1}{p}\right)P(k) > 0 \tag{135}$$

for both site and bond percolation. We again emphasize that, here, $P(k)$ is the degree distribution of the virgin network with $p = 1$, and $\overline{k}$ and $\overline{k^2}$ are the average degree and the second moment for the virgin (undamaged) network again. If we take the distribution $P(k)$ of the network with removed vertices or edges, we must use the original relations of Sec. XI A.

Criterion (135) may be rewritten in the form:

$$p\,z_2 > z_1, \tag{136}$$

where $z_1$ and $z_2$ are the average numbers of the first and second nearest neighbors in the virgin undamaged network, respectively. Compare Eq. (136) with Eq. (117).

Hence the percolation thresholds for both site and bond problems are at the same point,

$$p_c = \frac{z_1}{z_2} = \frac{1}{(\overline{k^2}/\overline{k}) - 1}. \tag{137}$$

Notice the beauty of this simple formula!
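
As a small numerical illustration of Eq. (137), the sketch below evaluates $p_c$ from the first two moments of a degree distribution given as a list `P[k]`. The helper names (`percolation_threshold`, `poisson`, `power_law`) and the truncation of the distributions at `k_max` are assumptions made here for illustration only. For a Poisson distribution with average degree $z$ the formula reduces to $p_c = 1/z$, while for a power law with $\gamma < 3$ the threshold keeps decreasing as the cut-off grows.

```python
import math

def percolation_threshold(P):
    """Eq. (137): p_c = 1 / (<k^2>/<k> - 1), moments of the undamaged network."""
    k1 = sum(k * pk for k, pk in enumerate(P))
    k2 = sum(k * k * pk for k, pk in enumerate(P))
    return 1.0 / (k2 / k1 - 1.0)

def poisson(z, k_max=100):
    """Poisson degree distribution truncated at k_max (illustrative helper)."""
    return [math.exp(-z) * z ** k / math.factorial(k) for k in range(k_max + 1)]

def power_law(gamma, k_max):
    """P(k) ~ k^(-gamma) for 1 <= k <= k_max, P(0) = 0 (illustrative helper)."""
    weights = [0.0] + [k ** -gamma for k in range(1, k_max + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]

if __name__ == "__main__":
    print(percolation_threshold(poisson(2.0)))
    for k_max in (10 ** 2, 10 ** 3, 10 ** 4):
        print(k_max, percolation_threshold(power_law(2.5, k_max)))
```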



Proceeding in a similar way to the derivation in Sec. XI A, one obtains the relations for the calculation of the relative size $S$ of the giant connected component. In the event of the site percolation problem,

$$1 - S = H(1) = 1 - p + p\,\Phi(t^*), \tag{138}$$

where $t^*$ is the smallest real non-negative solution of the equation

$$t^* = 1 - p + p\,\Phi_1(t^*). \tag{139}$$

For the bond-percolation problem, one should apply Eq. (139) and $1 - S = \Phi(t^*)$ instead of Eq. (138).
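
The site-percolation relations (138) and (139) are easy to solve numerically by iterating the self-consistency equation for $t^*$ starting from $t = 0$, which converges to the smallest non-negative solution. The following sketch assumes the degree distribution of the undamaged network is supplied as a list `P[k]`; the function names and the truncated Poisson example are illustrative assumptions, not part of the original treatment.

```python
import math

def giant_component_size(P, p, tol=1e-12, max_iter=100000):
    """Relative size S of the giant component for site percolation,
    from t* = 1 - p + p*Phi_1(t*) and 1 - S = 1 - p + p*Phi(t*)."""
    k1 = sum(k * pk for k, pk in enumerate(P))

    def phi(x):                      # Phi(x) = sum_k P(k) x^k
        return sum(pk * x ** k for k, pk in enumerate(P))

    def phi1(x):                     # Phi_1(x) = Phi'(x) / Phi'(1)
        return sum(k * pk * x ** (k - 1) for k, pk in enumerate(P) if k > 0) / k1

    t = 0.0                          # iterating upward from 0 reaches the smallest root
    for _ in range(max_iter):
        t_new = 1.0 - p + p * phi1(t)
        if abs(t_new - t) < tol:
            t = t_new
            break
        t = t_new
    return p * (1.0 - phi(t))

def poisson(z, k_max=100):
    """Poisson degree distribution truncated at k_max (illustrative helper)."""
    return [math.exp(-z) * z ** k / math.factorial(k) for k in range(k_max + 1)]

if __name__ == "__main__":
    P = poisson(2.0)
    for p in (0.4, 0.6, 0.8, 1.0):
        print(p, giant_component_size(P, p))
```

For the Poisson example with average degree 2 the threshold from Eq. (137) is $p_c = 1/2$, so the giant component is absent at $p = 0.4$ and present at the larger values of $p$.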



These relations were used in Ref. [62] to study the effect of random damage on networks with different degree distributions. The results of the numerical calculations support the observations discussed in Sec. XI C, so that it is really hard to eliminate the giant connected component by this means.



Basic relations (135) and the first of (137) were derived in Ref. [60] in a different but instructive way. Let us outline it briefly. One starts the derivation applying the original criterion (116) of Molloy and Reed directly to the randomly damaged network. Then, it contains the degree distribution $\tilde{P}(k)$ of the damaged network. This degree distribution may be expressed in terms of the original distribution in the following way [60]:

$$\tilde{P}(k) = \sum_{k'=k}^{\infty} P(k')\binom{k'}{k}p^{k}(1-p)^{k'-k}. \tag{140}$$

Note that this equation is valid both for the deletion of vertices and deletion of edges. One may check that the first and second moments of the degree distribution for the damaged network, $\tilde{P}(k)$, are related to the moments of the degree distribution of the virgin network:

$$\overline{\tilde{k}} = p\,\overline{k}, \qquad \overline{\tilde{k}^2} = p^2\,\overline{k^2} + p(1-p)\,\overline{k}. \tag{141}$$
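
The moment relations (141) can be checked numerically. The sketch below assumes that each connection of a vertex survives independently with probability $p$, so that the damaged distribution is the binomially thinned one; the function names and the sample distribution are illustrative assumptions. The Molloy-Reed combination for the damaged network, $\overline{\tilde{k}^2} - 2\overline{\tilde{k}}$, then equals $p\,[p(\overline{k^2} - \overline{k}) - \overline{k}]$, which changes sign exactly at $p_c$ of Eq. (137).

```python
import math

def damaged_distribution(P, p):
    """Degree distribution after random damage: every one of the k' connections
    of a vertex is kept independently with probability p (binomial thinning)."""
    k_max = len(P) - 1
    return [
        sum(P[kp] * math.comb(kp, k) * p ** k * (1 - p) ** (kp - k)
            for kp in range(k, k_max + 1))
        for k in range(k_max + 1)
    ]

def moments(P):
    """First and second moments of a degree distribution given as a list P[k]."""
    first = sum(k * pk for k, pk in enumerate(P))
    second = sum(k * k * pk for k, pk in enumerate(P))
    return first, second

if __name__ == "__main__":
    P = [0.1, 0.3, 0.2, 0.25, 0.15]      # arbitrary example distribution
    p = 0.6
    k1, k2 = moments(P)
    d1, d2 = moments(damaged_distribution(P, p))
    print(d1, p * k1)                                # first moment vs. p*<k>
    print(d2, p * p * k2 + p * (1 - p) * k1)         # second moment vs. Eq. (141)
    print(d2 - 2 * d1, p * (p * (k2 - k1) - k1))     # Molloy-Reed combination
```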



Substituting these relations into the criterion of Molloy and Reed (116), one immediately gets Eqs. (135) and (137).

FIG. 40. Schematic plot of dependences of the relative size of the largest connected component in the randomly damaged finite size net vs. the fraction of removed vertices $f = 1 - p$ [60]. Distinct curves correspond to different network sizes $N$ and two values of the $\gamma$ exponent, $2 < \gamma < 3$ and $\gamma > 3$. Arrows show displacement of the curves with increasing $N$. Dashed lines depict the limits $N \to \infty$. Notice the strong size effects.



In Ref. [60], for networks with power-law distributions, the variation of the size of the giant connected component with $p$ was studied through simulation (see Fig. 40). Strong size effects were observed. For $\gamma > 3$, the percolation threshold is visible at the point $p_c = 1 - f_c > 0$. For $\gamma < 3$, if a network is infinite, one sees that $p_c$ approaches zero, so that in this situation, one has to remove (at random) practically all vertices of the network to eliminate the giant connected component!

This very important observation can be understood if one looks at Eqs. (135) and (137). Indeed, from Eq. (137), it follows that the percolation threshold $p_c$ is zero if the second moment of the degree distribution of the virgin (undamaged) network diverges. This occurs in networks with power-law degree distributions when $\gamma < 3$, but this is only one particular possibility. For example, the second moment diverges in networks with copying (inheritance) of connections of vertices [187-189] (see Sec. X), so that these networks are also super-resilient to failures.

The condition $\gamma < 3$ for the resilience of scale-free networks to random damage and failures makes the values of the exponents of the degree distributions of communications networks quite natural. All of them are less than 3 (see Sec. V C). Note that many other networks, e.g., biological ones, must be necessarily resilient to failures. Therefore, this condition is of great importance.

One should note that this result is valid for infinite networks. As one can see from Fig. 40, for $\gamma < 3$, the size effects are very strong, and the curves slowly approach the infinite network limit where $p_c(N \to \infty) = 0$. Here, $N$ is the size of the network. Let us estimate this size effect for $2 < \gamma < 3$ by introducing the size-dependent threshold $p_c(N)$, whose meaning is clear from Fig. 40. When $N \gg 1$, from Eq. (137) we obtain

$$p_c(N) = \frac{z_1}{z_2} \cong \frac{\overline{k}}{\overline{k^2}} \cong \overline{k}\left[\int_{k_0}^{k_0 N^{1/(\gamma-1)}} dk\,k^2 P(k)\right]^{-1}. \tag{142}$$

Notice that, for $2 < \gamma < 3$, the average number of second-nearest neighbors is $z_2 \cong \overline{k^2}$, since the second moment diverges as $N \to \infty$, see Eq. (106). The nature of the upper cut-off of the power-law degree distribution, $k_{cut}/k_0 \sim N^{1/(\gamma-1)}$, was explained in Sec. IX D. $k_0 > 0$ can be estimated as the minimal value of the degree in the network. One may expect that $k_0 \sim 1$.

If $2 < \gamma < 3$, from Eq. (142), it readily follows that

$$p_c(N) = C(k_0,\gamma)\,N^{-(3-\gamma)/(\gamma-1)}. \tag{143}$$

Here, $C(k_0,\gamma)$ does not depend on $N$ and is of the order of 1. $C(k_0,\gamma)$ actually depends on the particular form of the degree distribution for small values of degree and is not of great interest here. When $\gamma$ is close to 3, $p_c = 0$ can be approached only for a huge network. Even if $\gamma = 2.5$ and a net is very large, we get noticeable values of the threshold $p_c$, e.g., $p_c(N = 10^6) \sim 10^{-2}$ and $p_c(N = 10^9) \sim 10^{-3}$.

This finite size effect is most pronounced when $\gamma = 3$. In this case, Eq. (142) gives

$$p_c(N) \cong \frac{2}{\overline{k}\ln N}. \tag{144}$$

For example, if $\overline{k} = 3$, $p_c(N = 10^4) \approx 0.07$, $p_c(N = 10^6) \approx 0.05$, and $p_c(N = 10^9) \approx 0.03$.
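
The finite-size estimates quoted above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the prefactor $C(k_0,\gamma)$ in Eq. (143) is simply set to 1, and the numerator 2 in the $\gamma = 3$ estimate (144) is inferred here from the numerical values quoted in the text; the function names are illustrative.

```python
import math

def pc_power_law(n, gamma, c=1.0):
    """Eq. (143): p_c(N) = C * N^(-(3 - gamma)/(gamma - 1)) for 2 < gamma < 3.
    The prefactor c is of order 1; c = 1 is an assumption for illustration."""
    return c * n ** (-(3.0 - gamma) / (gamma - 1.0))

def pc_gamma3(n, mean_degree):
    """Eq. (144) for gamma = 3, with the prefactor 2 inferred from the quoted numbers."""
    return 2.0 / (mean_degree * math.log(n))

if __name__ == "__main__":
    for n in (1e6, 1e9):
        print("gamma = 2.5, N = %g: p_c ~ %.3g" % (n, pc_power_law(n, 2.5)))
    for n in (1e4, 1e6, 1e9):
        print("gamma = 3, mean degree 3, N = %g: p_c ~ %.2f" % (n, pc_gamma3(n, 3.0)))
```

Even for $N = 10^9$ the threshold stays at the level of a few percent when $\gamma = 3$, which is the sense in which the size effect is "most pronounced" there.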

These estimates demonstrate that in reality, that is, for finite scale-free networks, the percolation threshold is actually present even if $2 < \gamma < 3$ (see Fig. 40). Only if $\gamma < 2$, the threshold $p_c(N)$ is of the order of $1/N$ (that is, the value of the natural scale for $p$) and is not observable. From the estimate (143) one sees that, if $\gamma > 2$, it should be close enough to 2 for the extreme resilience of finite scale-free networks to failures.

One may find the discussion of the resilience of directed equilibrium networks to random damage in Ref. [112].

Strictly speaking, all the results of this section were obtained for equilibrium networks. We do not know any analytical answers for the problem of instant damage in growing networks. However, it seems that the results for equilibrium networks describe the observations [59] of instant damage in growing networks quite well. This is not the case for permanent damage in growing networks [203, 204] (see Sec. XI G).



E. Intentional damage 

The intentional damage (attack), as we have seen in Sec. XI C, can be defined as the instant removal of a fraction of vertices with the largest degrees. The theory outlined in Secs. XI A and XI D can be easily generalized to the situation when the occupation probability depends on the degree of a vertex of the undamaged network, so that $p = p(k)$ [62]. Then, for the intentional damage, $p(k) = 1$ if $k \le k_{cut}$ and $p(k) = 0$ if $k > k_{cut}$. $k_{cut}$ depends on the value of the fraction $f$ of vertices which are deleted, so that $k_{cut} = k_{cut}(f)$.

In Ref. [62], this approach was used for the computation of the dependence of the size of the giant connected component on $f$. The networks with power-law degree distributions without isolated vertices (i.e., $P(0) = 0$) were considered. The calculations were performed for the power-law degree distributions with $\gamma = 2.4$, 2.7, and 3.0. It was shown that, in accordance with the observations in Ref. [59], the deletion, in such a way, of a rather small fraction of vertices eliminates the giant connected component. The corresponding threshold values of $f$ are really small, $f_c(\gamma = 2.4) = 2.3 \times 10^{-2}$, $f_c(2.7) = 1.0 \times 10^{-2}$, and $f_c(3.0) = 0.2 \times 10^{-2}$. In this respect, the networks are very sensitive to this kind of damage.

On the other hand, the corresponding values of $k_{cut}$, at which the transition takes place, were calculated to be $k_{cut} = 9$, 10, and 14 for $\gamma = 2.4$, 2.7, and 3.0, respectively. A simple estimate $\sum_{k=k_{cut}}^{\infty} P(k) \sim f$ yields $k_{cut}(f) \sim f^{-1/(\gamma-1)}$ and provides the values $k_{cut}(f_c(\gamma))$ close to the above ones. This means that for elimination of the giant connected component in this situation, one has to delete even vertices of rather low degree. In this regard, one must produce a really tremendous destruction to disintegrate such networks. Similar observations were made for the WWW [?] (see Sec. V C 2). It was found that even the deletion of all vertices of in-degree larger than 2 does not destroy the giant weakly connected component of the WWW.

Let us show how the threshold $f_c$ depends on $\gamma$. In Ref. [63] the dependence $f_c(\gamma)$ was obtained in the framework of the continuum approach. Here we show how one can get the exact results [205].

From the relations of Ref. [62] obtained for the situation when some vertices are deleted, and the occupation probability $p(k)$ depends on vertex degree, it is easy to derive the following condition for the percolation threshold

$$\sum_{k=0}^{\infty} k(k-1)P(k)\,p(k) = \sum_{k=0}^{\infty} kP(k) \tag{145}$$

(compare with Eq. (135)). Here, $P(k)$ is the degree distribution of the undamaged network. The intentional damage cuts off vertices with $k > k_{cut}(f)$, where $k_{cut}(f)$ can be obtained from

$$f = 1 - \sum_{k=0}^{k_{cut}(f)} P(k). \tag{146}$$

Then, the condition (145) takes the following form:

$$\overline{k^2} - 2\overline{k} = \sum_{k=0}^{\infty} k(k-2)P(k) = \sum_{k=k_{cut}(f)+1}^{\infty} k(k-1)P(k). \tag{147}$$

Here, $\overline{k^2}$ and $\overline{k}$ are the moments of the degree distribution of the undamaged network. Equation (147) may be rewritten in the form [205]

$$\sum_{k=0}^{k_{cut}(f)} k(k-1)P(k) = \sum_{k=0}^{\infty} kP(k). \tag{148}$$

This is the generalization of the Molloy-Reed criterion to the case of intentional damage.



Let us derive exact Eqs. (147) and (148) in another way using the instructive ideas of Ref. [63] but avoiding the continuum approximation. After the deletion of the most connected vertices, all the edges attached to the deleted vertices must also be removed. Connections in the network are random, so the probability $\tilde{f}$ that an edge is attached to one of the deleted vertices equals the ratio of the total number of edges of deleted vertices to the total degree of the network:

$$\tilde{f}(f) = \frac{1}{\overline{k}}\sum_{k=k_{cut}(f)+1}^{\infty} kP(k) = 1 - \frac{1}{\overline{k}}\sum_{k=0}^{k_{cut}(f)} kP(k). \tag{149}$$

Now we can recall that Eq. (137) for the percolation threshold is also valid for bond percolation. Therefore, it is possible to substitute $p_c = 1 - \tilde{f}_c$ into it. In fact, at first, the vertices of the highest degree were removed (the first step), and only afterwards their connections were deleted (the second step). In this event, Eq. (137) describes only the effect of removal of edges. Then, it seems natural to use in Eq. (137) the degree distribution with the cut-off $k_{cut}(f)$ arising after the first step. Accounting for this, one gets the relation

$$(1 - \tilde{f}_c)\sum_{k=0}^{k_{cut}} k^2 P(k) = (2 - \tilde{f}_c)\sum_{k=0}^{k_{cut}} kP(k), \tag{150}$$

from which Eq. (147) follows immediately. This demonstrates the equivalence of the approaches of Refs. [62] and [63].

In the particular case of the power-law degree distribution, $P(0) = 0$ and $P(k \ge 1) = k^{-\gamma}/\zeta(\gamma)$, where $\zeta(\gamma) = \sum_{k=1}^{\infty} k^{-\gamma}$ is the zeta-function, Eq. (146) takes the form

$$f = 1 - \frac{\sum_{k=1}^{k_{cut}} k^{-\gamma}}{\zeta(\gamma)}, \tag{151}$$

and the condition for the percolation threshold (147) looks like

$$\sum_{k=1}^{k_{cut}} k^{2-\gamma} = \zeta(\gamma - 1) + \sum_{k=1}^{k_{cut}} k^{1-\gamma}. \tag{152}$$

From Eqs. (151) and (152) one can easily obtain $k_{cut}(\gamma)$ and $f_c(\gamma)$ (see Fig. 41) [205]. Note that $f_c > 0$ only in the range $2 < \gamma < 3.479\ldots$ When $\gamma < 2$, a finite number of vertices keep a finite fraction of all connections, so their removal should have a striking effect on the network. For $\gamma > 3.479\ldots$, the giant connected component is absent even before the attack. Indeed, in the undamaged network, the giant connected component exists if $\sum_{k=1}^{\infty}(k^2 - 2k)k^{-\gamma} = \zeta(\gamma - 2) - 2\zeta(\gamma - 1) > 0$ [see Eq. (116)]. This corresponds to $\gamma < 3.479\ldots$.
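
The conditions for the attack threshold can be evaluated numerically for the pure power law $P(k) = k^{-\gamma}/\zeta(\gamma)$, $k \ge 1$. The sketch below is an illustration under stated assumptions: the zeta-function is approximated by a direct sum with an Euler-Maclaurin tail, the function names are hypothetical, and because $k_{cut}$ is an integer the routine returns the smallest cut-off at which the giant component still survives together with a bracket for $f_c$ (the removed fractions for that cut-off and for the cut-off one degree lower). The second function locates the value of $\gamma$ above which $\zeta(\gamma - 2) - 2\zeta(\gamma - 1)$ becomes negative.

```python
def zeta(s, n_terms=1000):
    """Riemann zeta for s > 1: direct sum plus an Euler-Maclaurin tail estimate."""
    head = sum(k ** -s for k in range(1, n_terms))
    n = float(n_terms)
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return head + tail

def attack_threshold(gamma, k_max=10 ** 6):
    """Smallest cut-off K with sum_{k<=K} (k^(2-gamma) - k^(1-gamma)) >= zeta(gamma-1),
    and the removed fractions f(K) <= f_c <= f(K-1) from f = 1 - sum_{k<=K} k^-gamma / zeta."""
    target = zeta(gamma - 1.0)
    norm = zeta(gamma)
    lhs = 0.0     # running left-hand side minus the finite sum on the right
    kept = 0.0    # running sum of k^(-gamma) over retained degrees
    for k in range(1, k_max + 1):
        kept_prev = kept
        lhs += k ** (2.0 - gamma) - k ** (1.0 - gamma)
        kept += k ** (-gamma)
        if lhs >= target:
            return k, 1.0 - kept / norm, 1.0 - kept_prev / norm
    raise ValueError("no threshold found below k_max")

def giant_component_limit(lo=3.1, hi=4.0, tol=1e-6):
    """Bisection for the gamma at which zeta(gamma-2) - 2*zeta(gamma-1) changes sign."""
    def g(gamma):
        return zeta(gamma - 2.0) - 2.0 * zeta(gamma - 1.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for gamma in (2.4, 2.7, 3.0):
        print(gamma, attack_threshold(gamma))
    print(giant_component_limit())
```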



FIG. 41. Dependence of the percolation threshold $f_c = 1 - p_c$ on the value of the $\gamma$ exponent of the large scale-free network for intentional damage (attack) [205]. Such a dependence was originally obtained in the framework of a continuum approach [63]. Here we present the exact curve. $f$ is the fraction of removed vertices with the largest numbers of connections. The degree distribution of the network before the attack is $P(k) \propto k^{-\gamma}$ for $k \ge 1$, $P(0) = 0$. The circle indicates the point $\gamma = 3.479\ldots$ above which $f_c = 0$. The squares represent the results of calculations and simulation in Ref. [62].

The dependence $f_c(\gamma)$ has the maximum, $f_c^{max} = 0.030\ldots$. This is a really small value, so the network is indeed weak against the attack. One should emphasize that this result is very sensitive to the particular form of $P(k)$ in the range of small $k$ and to the number of dead ends in the network. In particular, the range of the values of $\gamma$, where the giant connected component exists, crucially changes when the minimal degree increases.

In Ref. [63], one more interesting observation was made. The average shortest-path length between two vertices in random networks is of the order of the logarithm of their size (see Sec. III B). The same statement is valid for vertices in their giant connected components. In Ref. [63], the average shortest-path length between two vertices of the giant connected component was studied near the percolation threshold. In such a situation, like in ordinary infinite-dimensional percolation, the average shortest-path length was found to be proportional to the square root of the total number of vertices in the giant connected component.



F. Disease spread within networks



The dynamics of disease spread in undirected networks with exponential and power-law degree distributions was studied in Refs. [206, 207] and then in Refs. [208-210] (see the popular discussion of these problems in Ref. [211]). This process is generically related to the percolation properties of the networks.

For the modeling of the spread of diseases within networks, two standard models were used. In the susceptible-infected-susceptible (SIS) model,

(i) each healthy (susceptible) vertex is infected with rate $\nu$ when it has at least one infected neighbor, and

(ii) infected vertices are cured (become susceptible) with rate $\delta$.

Hence, the main parameter of the model is the effective spreading rate $\lambda = \nu/\delta$.

The susceptible-infected-removed (SIR) model with three states of vertices (susceptible (healthy), infected, and removed (dead or immunized)) is slightly more complex. Nevertheless, the main parameter of interest describing the spread of the disease is the same, that is, $\lambda$, and the main results for the SIR model on networks [?] are similar to those for the SIS model [206, 207, 209], so here we discuss the simpler case.

Thus, we speak about the SIS model on equilibrium undirected networks. In non-scale-free networks (exponential, Poissonian, the WS network, and others) the situation is very similar to the one for the disease spreading in ordinary homogeneous systems: there exists a nonzero epidemic threshold $\lambda_c \sim 1/\overline{k}$, where $\overline{k}$ is the average degree of a network, below which the disease dies out exponentially fast. This means that, after a random vertex becomes infected, the average density of infected vertices (prevalence) $\rho(t)$ rapidly approaches zero.

When $\lambda > \lambda_c$, the infection spreads and becomes endemic: $\rho(t \to \infty) = \rho \propto (\lambda - \lambda_c)$. Notice that the dependence is linear in the vicinity of the threshold.

For equilibrium undirected networks with arbitrary degree distribution $P(k)$, the epidemic threshold is at the point

$$\lambda_c = \frac{\overline{k}}{\overline{k^2}}. \tag{153}$$

The relation was obtained in the framework of the dynamical mean-field approach [206, 207]. Note that, if the ratio $\overline{k}/\overline{k^2}$ is small, the form of Eq. (153) naturally coincides with Eq. (137) for the percolation threshold of a randomly damaged network, and $\lambda_c = p_c$. Then the statements of Secs. XI C and XI D about the absence of the percolation threshold in infinite scale-free networks with $\gamma < 3$ can be repeated for the epidemic threshold: in infinite networks, if $\gamma < 3$, diseases spread and become endemic for any $\lambda > 0$. In general, in infinite networks, if $\overline{k^2}$ diverges, the epidemic threshold is absent, so that this result is applicable for a wide class of networks, e.g., see models of networks in Refs. [187-189] (Sec. X). Thus, although the infinite scale-free networks with $\gamma < 3$ are extremely robust against random damage, they are incredibly sensitive to the spread of infections. Both these, at first sight, contrasting phenomena have the same origin, the fat tail of a degree distribution.
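
The role of the fat tail can be illustrated by evaluating the ratio $\overline{k}/\overline{k^2}$ for a power-law degree distribution truncated at a size-dependent cut-off. The sketch below assumes $P(k) \propto k^{-\gamma}$ for $1 \le k \le k_{cut}$ with $k_{cut} = N^{1/(\gamma-1)}$ (the cut-off estimate of Sec. XI D with $k_0 = 1$); the function name and these choices are illustrative assumptions. For $\gamma < 3$ the threshold keeps decreasing as $N$ grows, whereas for large $\gamma$ it saturates at a finite value.

```python
def epidemic_threshold(gamma, n):
    """lambda_c = <k>/<k^2> for P(k) ~ k^(-gamma), 1 <= k <= N^(1/(gamma-1)).
    The normalization of P(k) cancels in the ratio."""
    k_cut = max(2, int(round(n ** (1.0 / (gamma - 1.0)))))
    first = sum(k ** (1.0 - gamma) for k in range(1, k_cut + 1))
    second = sum(k ** (2.0 - gamma) for k in range(1, k_cut + 1))
    return first / second

if __name__ == "__main__":
    for gamma in (2.5, 4.0):
        for n in (1e3, 1e6, 1e9):
            print(gamma, n, epidemic_threshold(gamma, n))
```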

One may notice that most of the observed scale-free networks in Nature (see Sec. V, Fig. 24, and Table I) have the $\gamma$ exponent between 2 and 3. Then, why are we still alive? If the claim about the absence of the epidemic threshold in these networks were strictly true, pandemics would never stop.

We emphasize that the strong statement that the threshold is absent can be applied only for infinite networks. In finite networks, the epidemic threshold is actually present. The estimates (143) and (144) also yield the dependence of the effective epidemic threshold $\lambda_c(N) = p_c(N)$ on the network size. As we demonstrated in Sec. XI D, $\lambda_c(N)$ in real networks with $\gamma < 3$ may be large enough. Only if $\gamma$ is close to 2 from above or smaller than 2, the threshold is unobservable. Notice that many real networks have the $\gamma$ exponents in this range. (Another quite plausible answer to our question is that the $\gamma$ exponent value of the web of human sexual contacts may be greater than 3, see Ref. [132].)

In an infinite scale-free network with $\gamma = 3$, the prevalence in the endemic state is

$$\rho \sim \exp(-C/\lambda) \tag{154}$$

(see Refs. [206, 207]), where $C$ is a constant. When $\gamma > 3$ but is close enough to $\gamma = 3$,

$$\rho \propto (\lambda - \lambda_c)^{1/(\gamma - 3)}. \tag{155}$$

Finally, according to continuum dynamic mean-field calculations in Refs. [206, 207], when $\gamma > 4$, $\rho \sim (\lambda - \lambda_c)$ (the degree distribution $P(k) \propto k^{-\gamma}\theta(k - m)$ was used). Note, however, that if $m = 1$, for large values of $\gamma$, the giant connected component is absent, the network is a set of disconnected clusters, and the disease spread is impossible. Here we again repeat that the results of this section were obtained for equilibrium random networks with statistically uncorrelated vertices.

How can we stop pandemics in widespread scale-free networks with $\gamma < 3$? In Refs. [208, 210], the effects of immunization for these networks were studied. The results obtained [208, 210] may be easily understood if we recall the weak effect of random damage to the integrity of such nets and the strong effect of an intentional attack (see Secs. XI C, XI D, and XI E). Analogously, random immunization cannot restore the epidemic threshold, but targeted immunization programs for highly connected vertices are the most effective way to stop an epidemic.



G. Anomalous percolation on growing networks



We have shown (see Sec. XI A and Fig. 38) that percolation on equilibrium networks displays many features of percolation on infinite-dimensional lattices, i.e., of standard mean-field or effective medium percolation. One might expect that such a "mean-field" behavior is natural for all random networks which have no real metric structure. Furthermore, it seems that the abrupt removal of a large fraction of randomly chosen vertices or edges from the growing network presumably leads to rather standard percolation phenomena, see Sec. XI C, although this issue is still not clear. However, as it was found in Ref. [203], this is not the case for networks growing under permanent damage, which show quite unusual percolation phenomena.

Here we discuss the process of the emergence of a giant connected component in a growing network under the variation of growth conditions. In Ref. [203], the simplest model of a growing network, in which such a percolation phenomenon is present, was studied (see Sec. IV):

(i) At each time step, a new vertex is added to the network.

(ii) Simultaneously, $b$ new undirected edges are created between $b$ pairs of randomly chosen vertices ($b$ may be non-integer).

The degree distribution of this simple network is exponential. The matter of interest is the probability $\mathcal{P}(s,t)$ that a randomly chosen vertex belongs to a connected component with $s$ vertices at time $t$. The master equation for this probability has the form [204]

$$t\,\frac{\partial \mathcal{P}(s,t)}{\partial t} + \mathcal{P}(s,t) = \delta_{s,1} + b\,s\sum_{u=1}^{s-1}\mathcal{P}(u,t)\,\mathcal{P}(s-u,t) - 2b\,s\,\mathcal{P}(s,t). \tag{156}$$

This is a basic master equation for the evolution of connected components in growing networks. Note that Eq. (156) is nonlinear unlike previously discussed master equations for degree distributions (see Eq. (12) in Sec. VIII, Eq. (21) in Sec. IX A, and Eq. (25) in Sec. IX B, among others). The first term on the right-hand side of Eq. (156), i.e., the Kronecker symbol, accounts for the addition of new vertices (single clusters) to the network, the second (gain) term is the contribution from the fusion of pairs of connected components into larger ones, and the last (loss) term describes the disappearance of connected components due to the fusion processes.

Equation (156) has a stationary solution $\mathcal{P}(s)$ in the long time limit. From the distribution $\mathcal{P}(s)$, the relative size of the giant connected component also follows: $W = 1 - \sum_{s=1}^{\infty}\mathcal{P}(s)$. The following surprising results were found numerically and by simulation in Ref. [203] and then in the framework of exact analysis in Ref. [204].
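
The growth rules (i) and (ii) above are simple enough to simulate directly. The following is a minimal sketch, not the procedure of the cited works: the function names, the union-find bookkeeping, and the treatment of a non-integer $b$ (adding $\lfloor b \rfloor$ edges plus one more with probability equal to the fractional part, so that $b$ edges are created per step on average) are assumptions made here for illustration. The routine returns the sizes of all connected components, from which the relative size of the largest one and the average size of the component containing a randomly chosen vertex follow.

```python
import random

def grow_random_network(n_steps, b, seed=0):
    """Grow a network by rules (i)-(ii) and return the list of component sizes."""
    rng = random.Random(seed)
    parent = []   # union-find forest over vertices
    size = []     # component size stored at each root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for t in range(n_steps):
        parent.append(t)                    # rule (i): one new vertex per step
        size.append(1)
        if t < 1:
            continue                        # a single vertex cannot host an edge
        # rule (ii): b edges per step on average (assumed handling of non-integer b)
        n_edges = int(b) + (rng.random() < b - int(b))
        for _ in range(n_edges):
            u, v = rng.sample(range(t + 1), 2)
            ru, rv = find(u), find(v)
            if ru != rv:                    # fuse two components
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru
                size[ru] += size[rv]
    roots = {find(v) for v in range(n_steps)}
    return [size[r] for r in roots]

def summary(sizes):
    """Relative size of the largest component, and the mean size of the
    component to which a randomly chosen vertex belongs."""
    n = sum(sizes)
    return max(sizes) / n, sum(s * s for s in sizes) / n

if __name__ == "__main__":
    for b in (0.05, 0.125, 0.5, 1.0):
        print(b, summary(grow_random_network(20000, b, seed=1)))
```

Well below $b = 1/8$ the largest component remains a negligible fraction of the network, while well above it a finite fraction of all vertices belongs to a single component.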

It was shown that the phase transition of the emergence of the giant connected component (percolation transition) in the growing network cannot be described by an effective medium theory. This phase transition is of infinite order, and, near the percolation threshold, the relative size of the giant component behaves as

$$W(b) \cong 0.590\ldots\,\exp\left(-\frac{\pi}{2\sqrt{2}}\,\frac{1}{\sqrt{b - b_c}}\right), \tag{157}$$

where the constant $0.590\ldots$ may be calculated up to any desirable precision [204]. Here the rate $b$ of the emergence of new edges plays the role of a control parameter. When $b$ is small, the giant component is absent, and, obviously, when $b$ is large, the giant component must be present. The phase transition occurs at $b_c = 1/8$. All the derivatives of $W(b)$ are zero at this point in sharp contrast to a linear $W(b)$ dependence in the standard mean field theory of percolation. This indicates that this phase transition is of infinite order like the Berezinskii-Kosterlitz-Thouless phase transition [212, 213].



When $b < 1/8$, i.e., in the phase without the giant connected component, the average size of a finite connected component is

$$\overline{s} = \frac{1 - \sqrt{1 - 8b}}{4b}. \tag{158}$$

When the giant component is present, i.e., when $b > 1/8$,

$$\overline{s} = \frac{1}{2b(1 - W)}. \tag{159}$$

This means that the average size of a finite connected component jumps discontinuously at the percolation threshold from 2 at $b = 1/8 - 0$ (the phase without the giant component) to 4 at $b = 1/8 + 0$ (the phase in which the giant component is present) [203]. This behavior is in contrast to the divergence of $\overline{s}$ at the threshold point for standard percolation. (In the latter case, the divergence takes place whether one approaches the threshold from above or below.) We emphasize that the anomalous percolation transition is not accompanied by any anomaly of the degree distribution.
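
The jump of the average size of a finite connected component at $b_c = 1/8$ can be made explicit by evaluating the expressions for the two phases on both sides of the threshold. The sketch below uses the near-threshold form of $W(b)$ with the constants quoted above; the function names are illustrative assumptions, and the expression for $W(b)$ is only meaningful slightly above $b_c$.

```python
import math

B_C = 1.0 / 8.0

def giant_size_near_threshold(b):
    """Near-threshold W(b) = 0.590 * exp(-pi / (2*sqrt(2)*sqrt(b - b_c))); zero for b <= b_c."""
    if b <= B_C:
        return 0.0
    return 0.590 * math.exp(-math.pi / (2.0 * math.sqrt(2.0) * math.sqrt(b - B_C)))

def mean_finite_component_size(b, w=0.0):
    """Average size of a finite component below (b <= 1/8) and above (b > 1/8) the threshold."""
    if b <= B_C:
        return (1.0 - math.sqrt(1.0 - 8.0 * b)) / (4.0 * b)
    return 1.0 / (2.0 * b * (1.0 - w))

if __name__ == "__main__":
    eps = 1e-9
    print(mean_finite_component_size(B_C))                                   # just below
    b = B_C + eps
    print(mean_finite_component_size(b, giant_size_near_threshold(b)))       # just above
```

Because $W(b)$ vanishes faster than any power of $b - b_c$, the value just above the threshold is indistinguishable from $1/(2b_c) = 4$, while the value at the threshold from below is exactly 2.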
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FIG. 42. Anomalous percolation in the growing network, 
(a) The giant connected component size $W$ vs. the rate $b$ of the creation of new edges in the growing network (see Eq. (157)). (b) The average size $\overline{s}$ of a finite connected component vs. $b$ (see Eqs. (158) and (159)) [?]. The region of anomalous percolation threshold $b_c = 1/8$ is shown.



The probability $\mathcal{P}(s)$ that a randomly chosen vertex belongs to a finite connected component of size $s$ shows surprising behavior [204]. Recall that in standard percolation and in percolation in equilibrium networks $\mathcal{P}(s)$ is of a power-law form at the percolation threshold and it decreases exponentially both below and above the percolation threshold. More precisely, near the standard percolation threshold, $\mathcal{P}(s)$ is a power law with an exponential cutoff. In the growing network the situation is quite different:

(i) $\mathcal{P}(s) \propto [s\ln s]^{-2}$ at the percolation threshold,

(ii) $\mathcal{P}(s)$ is a power-law function with an exponential cutoff at $s_c \propto 1/W$ in the phase with the giant component, and,

(iii) in contrast to standard percolation, $\mathcal{P}(s)$ has a power-law tail in the entire phase without the giant component.



Furthermore, in Ref. [204] this model was generalized, and the percolation in growing scale-free undirected networks was studied (the preferential linking mechanism was applied). The results are similar to those described above. However, in this case, the percolation threshold $b_c$ and the factors in Eq. (157) depend on the value of the exponent of the degree distribution.

This anomalous behavior may be interpreted in the 
following way. New edges are being attached to large 
connected components with higher probability, and large 
connected components have a better chance to merge and 
grow. This produces the preferential growth of large connected
components even in networks where new edges are
attached to randomly chosen vertices, that is, without
any preference. Such a mechanism of effective preferen- 
tial attachment of new vertices to large connected compo- 
nents naturally produces power-law distributions of the 
sizes of connected components and power-law probabili- 
ties V(s). This "self-organized critical state" is realized 
in the growing network only if the giant component is 
absent. 
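
The effective preference for large components can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the procedure of the cited references: one vertex is added per time step and, with probability b, one edge is placed between two uniformly chosen existing vertices. The function name grow_components and the union-find bookkeeping are illustrative choices. A uniformly chosen vertex lies in a component of size s with probability proportional to s, which is the effective preferential attachment discussed above.

```python
import random
from collections import Counter


def grow_components(steps, b, seed=0):
    """Grow a network: one new vertex per step and, with probability b,
    one new edge between two uniformly chosen existing vertices
    (an assumed growth rule used only for illustration).
    Returns a Counter mapping component size -> number of components."""
    rng = random.Random(seed)
    parent = []   # union-find forest over vertices
    size = []     # component size, valid at root vertices only

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for _ in range(steps):
        parent.append(len(parent))          # new isolated vertex
        size.append(1)
        if rng.random() < b:
            u = find(rng.randrange(len(parent)))
            v = find(rng.randrange(len(parent)))
            if u != v:                      # merge two distinct components
                if size[u] < size[v]:
                    u, v = v, u
                parent[v] = u
                size[u] += size[v]

    return Counter(size[v] for v in range(len(parent)) if parent[v] == v)
```

For example, grow_components(100000, 0.1) returns the component-size histogram for a network below the threshold b = 1/8; dividing s times the count by the number of vertices gives an estimate of V(s).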

As soon as the giant connected component emerges,
the situation changes radically. A new channel of the
evolution of connected components comes into play,
and, with high probability, large connected components
do not grow into even larger ones but join the giant
component. Therefore, there are few large connected
components if the giant component is present, and then
V(s) is exponential.

Thus, in the growing networks, two phases are in contact
at the point of the emergence of the giant connected
component: the critical phase without the giant
component and the normal phase with the giant component.
This contact produces the effects described above.
There exists another example of a contact of a "critical
phase" (the line of critical points) with a normal phase,
namely, the Berezinskii-Kosterlitz-Thouless phase transition
[212, 213]. Interestingly, critical dependences in both
these cases have similar functional forms.

We finish this section with the following remark. Other
percolation problems for random networks can be considered.
For instance, in the recent paper [214], "core"
percolation was introduced. Dead-end vertices and their
nearest neighbors are removed successively up to the
point when no dead ends remain. The remaining giant
component (if it exists) is called the "core". For the classical
random graph (Erdős-Rényi network), it was found that
the core is present when the average degree k is above
e = 2.718.... For comparison, in the same network, the
ordinary giant connected component exists if k > 1 [74]
(see Secs. VI and XI A).
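
The leaf-removal procedure described above can be sketched as follows. This is an illustrative implementation, not code from Ref. [214]; the function name core_vertices and the adjacency-dictionary input format are assumptions, and the graph is assumed to be simple (no self-loops). The function returns every surviving vertex that still has an edge; extracting the giant component of what remains is left out.

```python
def core_vertices(adjacency):
    """Leaf removal: repeatedly delete a dead-end (degree-1) vertex together
    with its nearest neighbor until no dead ends remain.
    `adjacency` maps each vertex to an iterable of its neighbors
    (undirected simple graph). Returns the set of surviving vertices
    that still have at least one edge."""
    adj = {v: set(ns) for v, ns in adjacency.items()}
    leaves = [v for v, ns in adj.items() if len(ns) == 1]

    def remove(v):
        # Delete v and record neighbors that become dead ends.
        for u in adj.pop(v):
            adj[u].discard(v)
            if len(adj[u]) == 1:
                leaves.append(u)

    while leaves:
        leaf = leaves.pop()
        if leaf not in adj or len(adj[leaf]) != 1:
            continue                    # stale entry: already removed or changed
        (neighbor,) = adj[leaf]
        remove(leaf)
        remove(neighbor)

    return {v for v, ns in adj.items() if ns}
```

A chain of vertices is removed completely, whereas a graph without dead ends (for example, a complete graph on four vertices) is left untouched.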



XII. GROWTH OF NETWORKS AND 
SELF-ORGANIZED CRITICALITY

We have demonstrated above that the growing net- 
works often self-organize into scale-free structures. The 
change of parameters controlling their growth removes 
them from the class of scale-free structures. This is typ- 
ical for the general self-organized criticality phenomena 
HI (70|, so the considered processes can be linked with 
many other problems. In the present section, we briefly
discuss these links.



A. Linking with sand-pile problems 

As long as only a degree distribution, that is, the distribution
of a one-vertex characteristic, is studied, the
models of networks growing with preferential linking can
be reduced to the following general problem [160] (see
Fig. 43). At each increment of time, m new particles
are distributed between the increasing number (by one
per time step) of boxes according to some rule. Here, the
boxes play the role of vertices. The particles are associated
with edges. The probability that a new particle gets
to a particular box depends on the filling of this box and
on the filling numbers of all other boxes. In fact, what we
did in Secs. VIII, IX, and X was mainly a consideration
of various versions of this classical model.

One can enumerate the boxes by age, so that such a 
system has boundaries, the "oldest" box and the new one, 
and, naturally, the distributions of particles in different 
boxes are different. In fact, the resulting enumerated set 
of boxes looks like a sand pile with a front (boxes being 
added) moving with unit rate. The height of the sand 
pile increases as the box age grows. 
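
The box-and-particle picture can be sketched in a few lines. The text leaves the distribution rule open, so the sketch below assumes one particular rule: a particle lands in a box with probability proportional to the filling of that box plus a constant a. The names fill_boxes and a are hypothetical, and the weights are frozen within a time step, which is a simplification.

```python
import random


def fill_boxes(steps, m=1, a=1.0, seed=0):
    """Each step: add one empty box, then drop m particles; a particle lands
    in box i with probability proportional to (filling_i + a).
    The linear rule and the constant `a` are assumptions for illustration.
    Weights are computed once per step, so the m particles of a step
    do not influence each other.
    Returns the fillings ordered from the oldest box to the newest."""
    rng = random.Random(seed)
    filling = []
    for _ in range(steps):
        filling.append(0)                          # the new box
        weights = [q + a for q in filling]
        for box in rng.choices(range(len(filling)), weights=weights, k=m):
            filling[box] += 1
    return filling
```

Listing the returned fillings by box index shows the sand-pile profile: the height tends to grow with the age of the box. The cost is quadratic in the number of steps, which is acceptable only for small illustrations.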

Obvious relations to some other classical problems are 
also possible. For instance, the arising master equations 
can be generically related to those for fragmentation phenomena.
One should also mention the Flory-Stockmayer
theory of polymer growth [215-217].



B. Preferential linking and the Simon model 



Reasons for power-law distributions occurring in various
systems, including systems mentioned in Sec. XII A,
were a matter of interest of numerous empirical and the- 
oretical studies starting from 1897 [218- 22fj( . An impor- 
tant advance was achieved by HA. Simon (1955), who 
proposed a simple model producing scale-free distribu- 
tions §|||. 

The Simon model can be formulated in the following
way [172].

Individuals are divided into groups (families).

(i) At each increment of time, a new individual is added
to the system.

(ii) a) With probability p (Simon used the notation α),
it establishes a new family; b) with the complementary
probability 1 - p, it chooses at random some old individual
and joins its family.
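
Rules (i) and (ii) translate directly into a short simulation. The sketch below is illustrative; the name simon_model and the list-based bookkeeping are assumptions. Choosing a uniformly random old individual and joining its family selects a family with probability proportional to its size, so no explicit weights are needed.

```python
import random


def simon_model(steps, p, seed=0):
    """Simulate the Simon model for `steps` individuals.
    With probability p a newcomer founds a new family; otherwise it picks
    a uniformly random old individual and joins that individual's family.
    The very first individual always founds a family.
    Returns the list of family sizes."""
    rng = random.Random(seed)
    family_of = []     # family_of[i] = index of the family of individual i
    family_size = []   # family_size[f] = current size of family f
    for _ in range(steps):
        if not family_of or rng.random() < p:
            family_of.append(len(family_size))
            family_size.append(1)
        else:
            fam = family_of[rng.randrange(len(family_of))]
            family_of.append(fam)
            family_size[fam] += 1
    return family_size
```

A histogram of the returned sizes, normalized by the number of families, gives an empirical estimate of the size distribution of families at finite time.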

The rule (ii) b) simply means that new individuals are 
distributed among families with probability proportional 
to their sizes, similar to rules for preferential linking. The 
number of individuals, of course, equals t, and, at long 
times, the number of families is pt. Using the master 
equation approach (see Sec. IX B), and passing to the 
long-time limit it is possible to get the following station- 
ary equation for the distribution of the sizes of families, 

$$ P(k) + (1 - p)\,[k P(k) - (k - 1) P(k - 1)] = \delta_{k,1} . \tag{160} $$

Introducing $\gamma = 1 + 1/(1 - p)$, we can write the solution
of Eq. (160) in the form

$$ P(k) = (\gamma - 1) B(k, \gamma) \propto k^{-\gamma} \quad (k \gg 1) , \tag{161} $$

where B( , ) is the beta function. Therefore, the power-law
distribution with exponent $\gamma$ naturally arises. Recently,
the non-stationary distribution P(k, t) was also described
analytically [221] (see also Ref. [222]).
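
That the beta-function form solves the stationary equation can be checked numerically. The sketch below assumes the stationary equation P(k) + (1 - p)[k P(k) - (k - 1) P(k - 1)] = delta_{k,1} and the solution P(k) = (gamma - 1) B(k, gamma) with gamma = 1 + 1/(1 - p); the helper names beta, simon_stationary and residual are illustrative.

```python
from math import exp, lgamma


def beta(x, y):
    """Euler beta function B(x, y) computed through log-gamma."""
    return exp(lgamma(x) + lgamma(y) - lgamma(x + y))


def simon_stationary(k, p):
    """Stationary family-size distribution P(k) = (gamma - 1) B(k, gamma),
    with gamma = 1 + 1/(1 - p)."""
    gamma = 1.0 + 1.0 / (1.0 - p)
    return (gamma - 1.0) * beta(k, gamma)


def residual(k, p):
    """Left-hand side minus right-hand side of the stationary equation
    P(k) + (1 - p)[k P(k) - (k - 1) P(k - 1)] = delta_{k,1}."""
    pk = simon_stationary(k, p)
    prev = simon_stationary(k - 1, p) if k > 1 else 0.0
    lhs = pk + (1.0 - p) * (k * pk - (k - 1) * prev)
    return lhs - (1.0 if k == 1 else 0.0)
```

Evaluating residual(k, p) for a range of k gives values at the level of floating-point rounding, and the values of simon_stationary(k, p) sum to one over k, as a normalized distribution should.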

The Simon model was originally proposed without any 
relation to networks. However, it is possible to formulate 
the Simon model for networks in terms of vertices and,
e.g., directed edges [177].

(i) At each increment of time, a new edge is added to 
the network. 

(ii) a) Also, with probability p, a new vertex is added,
and the target end of the new edge is attached to this
vertex. b) With the complementary probability 1 - p,
the target end of the new edge is attached to the target
end of a randomly chosen old edge.

Here, rule (ii) b) corresponds to the distribution of 
new edges among vertices with probability proportional 
to their in-degree. One should point out a difference
between the Simon model and the models of growing networks
with preferential linking. In the models of Secs.
VIII, IX, and X, one vertex was added at each time step.
In the Simon model, at each increment of time, one individual
(edge) is added, and the number of added families
(vertices) is not fixed. Of course, this cannot change the
stationary distributions and the value of the exponent $\gamma$. The
behavior of P(k, t) at long times (large network sizes) is
also similar.

In fact, both the original Simon model and the prefer- 
ential linking concept are based on a quite general prin- 
ciple — popularity is attractive. Popular objects (idols) 
attract more new fans than the unpopular ones. 

Nevertheless, one should note that Simon was interested
in the one-particle distribution, whereas,
for networks, this is only a small part of the great problem:
what is their topology?



C. Multiplicative stochastic models and the
generalized Lotka-Volterra equation

One may look at the models of network growth under
the mechanism of preferential linking from another point
of view. The variation (increase) of the degree of a vertex
is proportional to the degree of this vertex. This allows
us to relate such models to the wide class of multiplicative
stochastic processes. Recently, these processes have been
intensively studied in econophysics and evolutionary
biology [223-229]. In particular, they are used for the
description of wealth distribution. The most widely known
example is the generalized Lotka-Volterra equation [223],

$$ w_i(t + 1) = r_i(t) w_i(t) + A t \bar{w}(t) - c(\bar{w}(t), t) w_i(t) . \tag{162} $$

Here, $w_i(t)$ may be interpreted as the wealth of agent i,
i = 1, ..., N, and $\bar{w}(t) = \sum_{i=1}^{N} w_i(t)/N$ is the average wealth
at time t. The distribution of the random noise $r_i(t)$ is
independent of t. Its average value is $\langle r_i(t) \rangle = Bt$ and the
standard deviation equals Dt. A is a positive constant,
and $c(\bar{w}, t)$ is proportional to t at long times. Such a
dependence on t of the coefficients in Eq. (162) provides
stationary distributions at long times. In the differential
form, Eq. (162) can be written as

$$ \frac{dw_i(t)}{dt} = [r_i(t) - 1] w_i(t) + A t \bar{w}(t) - c(\bar{w}(t), t) w_i(t) . \tag{163} $$

Interpretation of Eqs. (162) and (163) in terms of the
wealth distribution is quite obvious. In particular, the
last term restricts the growth of wealth. It was shown
[229, 230] that the average wealth $\bar{w}(t)$ approaches a fixed
value $\bar{w}$ at long times, and these equations produce the
following stationary distribution

$$ P(w_i) \propto \exp\left( -\frac{A}{D} \frac{\bar{w}}{w_i} \right) w_i^{-2 - A/D} , \tag{164} $$

where the exponent of the power-law dependence is $\gamma$ =
2 + A/D. Eq. (164) gives P(0) = 0. Note that the
resulting distribution is independent of B and $c(\bar{w}, t)$.

The main difference of the particular stochastic multiplicative
process described by Eqs. (162) and (163) from
the models considered in Secs. VIII, IX, X, and XII B
is the fixed number of the involved agents. Nevertheless,
the outlined general approach can be used for networks
(e.g., see Ref. [185]). On the other hand, the results obtained
for the degree distributions of evolving networks
(see Secs. VIII, IX, and X) may be interpreted, for example,
in terms of the wealth distribution in evolving
societies.



XIII. CONCLUDING REMARKS



The progress in this field is incredibly rapid, and we
have failed to discuss, or even cite, a number of recent
results. In particular, we have missed random, intentional,
and adaptive walks on networks 6s , 231 - 234 ] which are 
closely related to problems of the organization of effective
search in communication networks. We did not discuss
the distribution of the number of the shortest paths
passing through a vertex, which has a power-law form
in scale-free networks [235, 236], or many other interesting
problems (see, e.g., Refs. [237-251]). After this
review had been submitted, we learned about the review
of Albert and Barabási on the statistical mechanics of
networks under preparation [252]. We call the reader's
attention to this paper.

Most studies which we reviewed focused on structural 
properties of growing networks. Two aspects of the prob- 
lem can be pointed out. 

(i) Specific mechanisms of the network growth produce 
their structure and, in particular, the degree distributions 
of their vertices. We demonstrated that the preferential 
linking mechanism (preferential attachment of new edges
to vertices with a higher number of connections) provides
degree distributions with long tails. Such nets are
abundant in Nature. In particular, the communications
networks have degree distributions just of this kind.
Preferential linking is the reason for the self-organization
of a number of growing networks into scale-free structures.

(ii) The resulting networks with such long-tailed distributions
have quite different properties from classical
random graphs with the Poisson degree distribution. In
particular, they may be extremely resilient to random
damage. This substantial property partly explains their
abundance in Nature. On the other hand, diseases may
freely spread within these nets. The global topology of
such networks is described by a theory that actually generalizes
the standard percolation theory. This theory is
based on the assumption of statistical independence of
vertices of an equilibrium network. In such an event, the
joint in- and out-degree distribution $P(k_i, k_o)$ completely
determines the structure of the network. This is one of
the reasons why the knowledge of the degree distribution
is so important. Despite the evident success of this approach,
one can see that its basic assumptions are not
valid for non-equilibrium, growing networks.

Keeping in mind the most intriguing applications to
evolving communications networks, we have to admit
that, currently, most of the discussed models and
ideas can be applied to real networks only on a
schematic, qualitative level. These simple models are still
far from reality and only address particular phenomena
in real networks.

The title of the seminal paper of Erdős and Rényi
(1960) was "On the evolution of random graphs" p8[ . 
What Erdős and Rényi called "random graphs" were
simple equilibrium graphs with the Poisson degree distribution.
What they called "evolution" was actually the
construction procedure for these graphs. The main recent
achievements in the theory of networks are related to the
transition to the study of evolving, self-organizing networks
with non-Poisson degree distributions. The fast progress
in this field, in particular, means a very significant step
toward understanding the most exciting networks of our
World: the Internet, the WWW, and basic biological networks.